shjo-april / puzzlecam

[ICIP 2021] Puzzle-CAM: Improved localization via matching partial and full features.

convolutional-neural-networks deeplearning semantic-segmentation weakly-supervised-learning weakly-supervised-segmentation


puzzlecam's Issues

Question about the data augmentation ablation

Greetings,

Thank you for the code and the excellent paper; they offer a novel way to model the relationship between the most discriminative part of an object and its remaining parts. While reading the code I came across a few questions; would you have time to answer them? Thanks!

I noticed that you apply color jitter and RandAugment during training. I am curious how much improvement these augmentations provide; did you run an ablation study on them?

ModuleNotFoundError: No module named 'core.sync_batchnorm'

```
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input> in <module>
      1 from core.puzzle_utils import *
----> 2 from core.networks import *
      3 from core.datasets import *
      4
      5 from tools.general.io_utils import *

/working/PuzzleCAM/core/networks.py in <module>
     24 # Normalization
     25 #######################################################################
---> 26 from .sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
     27
     28 class FixedBatchNorm(nn.BatchNorm2d):

ModuleNotFoundError: No module named 'core.sync_batchnorm'
```
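A possible explanation (an assumption, not confirmed by the repository): `core/sync_batchnorm` is typically vendored from the third-party Synchronized-BatchNorm-PyTorch project, so this error usually means that package is missing from the working copy. A minimal sketch of an import-time fallback; for single-GPU runs the standard `torch.nn.BatchNorm2d` can stand in for the synchronized variant:

```python
# Sketch of an import fallback (illustrative, not the repo's code). We use
# importlib so the branch taken is visible even without PyTorch installed.
import importlib


def load_batchnorm():
    """Prefer the vendored synchronized batch norm; otherwise fall back."""
    try:
        module = importlib.import_module("core.sync_batchnorm.batchnorm")
        return module.SynchronizedBatchNorm2d
    except ModuleNotFoundError:
        # In a real script this would be `from torch.nn import BatchNorm2d`;
        # here we only report which branch was taken.
        return "fallback: torch.nn.BatchNorm2d"
```

The cleaner fix is simply to ensure the `core/sync_batchnorm` directory is present in the checkout.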

Training problem

Hello, I would like to ask: when I train with train_classification_with_puzzle.py, why is my mIoU stuck at 4.23% and unable to improve? I trained on two 2080 Ti GPUs. The log looks like this:
[i] iteration=66, learning_rate=0.0994, alpha=0.03, loss=1.2468, class_loss=0.6384, p_class_loss=0.6042, re_loss=0.3179, conf_loss=0.0000, time=65sec
[i] iteration=132, learning_rate=0.0988, alpha=0.08, loss=0.5721, class_loss=0.2831, p_class_loss=0.2832, re_loss=0.0733, conf_loss=0.0000, time=54sec
[i] iteration=198, learning_rate=0.0982, alpha=0.13, loss=0.5701, class_loss=0.2822, p_class_loss=0.2793, re_loss=0.0651, conf_loss=0.0000, time=54sec
[i] iteration=264, learning_rate=0.0976, alpha=0.19, loss=0.5488, class_loss=0.2696, p_class_loss=0.2712, re_loss=0.0434, conf_loss=0.0000, time=54sec
[i] iteration=330, learning_rate=0.0970, alpha=0.24, loss=0.5321, class_loss=0.2615, p_class_loss=0.2606, re_loss=0.0416, conf_loss=0.0000, time=54sec
[i] iteration=396, learning_rate=0.0964, alpha=0.29, loss=0.5358, class_loss=0.2632, p_class_loss=0.2591, re_loss=0.0462, conf_loss=0.0000, time=53sec
[i] iteration=462, learning_rate=0.0958, alpha=0.35, loss=0.5387, class_loss=0.2635, p_class_loss=0.2608, re_loss=0.0417, conf_loss=0.0000, time=54sec
[i] iteration=528, learning_rate=0.0952, alpha=0.40, loss=0.5292, class_loss=0.2578, p_class_loss=0.2573, re_loss=0.0351, conf_loss=0.0000, time=54sec
[i] iteration=594, learning_rate=0.0946, alpha=0.45, loss=0.5279, class_loss=0.2573, p_class_loss=0.2545, re_loss=0.0356, conf_loss=0.0000, time=53sec
[i] iteration=660, learning_rate=0.0940, alpha=0.51, loss=0.5173, class_loss=0.2518, p_class_loss=0.2496, re_loss=0.0313, conf_loss=0.0000, time=53sec
[i] save model
[i] iteration=661, threshold=0.10, train_mIoU=4.23%, best_train_mIoU=4.23%, time=28sec
[i] iteration=726, learning_rate=0.0934, alpha=0.56, loss=0.5259, class_loss=0.2554, p_class_loss=0.2537, re_loss=0.0303, conf_loss=0.0000, time=83sec
[i] iteration=792, learning_rate=0.0928, alpha=0.61, loss=0.5114, class_loss=0.2484, p_class_loss=0.2483, re_loss=0.0241, conf_loss=0.0000, time=53sec
[i] iteration=858, learning_rate=0.0922, alpha=0.67, loss=0.5194, class_loss=0.2523, p_class_loss=0.2526, re_loss=0.0219, conf_loss=0.0000, time=53sec
[i] iteration=924, learning_rate=0.0916, alpha=0.72, loss=0.5110, class_loss=0.2479, p_class_loss=0.2472, re_loss=0.0221, conf_loss=0.0000, time=53sec
[i] iteration=990, learning_rate=0.0910, alpha=0.77, loss=0.5185, class_loss=0.2515, p_class_loss=0.2500, re_loss=0.0220, conf_loss=0.0000, time=53sec
[i] iteration=1,056, learning_rate=0.0904, alpha=0.83, loss=0.5102, class_loss=0.2470, p_class_loss=0.2465, re_loss=0.0202, conf_loss=0.0000, time=53sec
[i] iteration=1,122, learning_rate=0.0898, alpha=0.88, loss=0.5295, class_loss=0.2542, p_class_loss=0.2514, re_loss=0.0271, conf_loss=0.0000, time=53sec
[i] iteration=1,188, learning_rate=0.0892, alpha=0.93, loss=0.5289, class_loss=0.2525, p_class_loss=0.2503, re_loss=0.0280, conf_loss=0.0000, time=53sec
[i] iteration=1,254, learning_rate=0.0886, alpha=0.98, loss=0.5302, class_loss=0.2542, p_class_loss=0.2531, re_loss=0.0233, conf_loss=0.0000, time=53sec
[i] iteration=1,320, learning_rate=0.0879, alpha=1.04, loss=0.5196, class_loss=0.2501, p_class_loss=0.2486, re_loss=0.0201, conf_loss=0.0000, time=53sec
[i] iteration=1,322, threshold=0.10, train_mIoU=4.19%, best_train_mIoU=4.23%, time=29sec
[i] iteration=1,386, learning_rate=0.0873, alpha=1.09, loss=0.5252, class_loss=0.2533, p_class_loss=0.2521, re_loss=0.0182, conf_loss=0.0000, time=84sec
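For context, `train_mIoU` here is computed from the CAM pseudo-masks against the ground-truth masks, and a value stuck near 4% usually indicates nearly constant (e.g. all-background) predictions. A generic mIoU sketch (not the repository's own evaluation code) for sanity-checking pseudo-masks:

```python
import numpy as np


def miou(pred, gt, num_classes=21):
    """Mean IoU over classes that appear in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))
```

Running this over a few generated pseudo-masks can quickly show whether they are degenerate.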

Hyperparameter 'alpha' for AffinityNet

Hello Author,

Thanks for sharing great works!

I have a question about how to set the hyperparameter alpha for AffinityNet. In AffinityNet there is a parameter 'alpha' that adjusts the background confidence scores, as described in Equation (2) of the paper.

For your experimental setting, which alpha value did you use?
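For readers unfamiliar with the parameter: as I understand Eq. (2) of the AffinityNet paper, the background score is derived from the foreground CAMs and sharpened by alpha, so a larger alpha suppresses the background score and enlarges the region treated as foreground. A minimal sketch (the exact normalization in the repo may differ):

```python
import numpy as np


def background_score(cams, alpha):
    """Background map per AffinityNet-style Eq. (2) (sketch).

    cams: array of shape (num_classes, H, W), each channel in [0, 1].
    Larger alpha pushes the background score toward 0.
    """
    return (1.0 - np.max(cams, axis=0)) ** alpha
```

For example, a pixel whose strongest class activation is 0.5 gets a background score of 0.5 with alpha = 1 but only 0.25 with alpha = 2.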

Training Logs

Dear Sanghyun Jo,
I was wondering whether you could share the training logs with your final parameters, losses, and mIoUs?
Thanks, Alex

EDIT:
Also, I would like to ask whether you think it is reasonable to rely entirely on the loss for training and validation, since I have no ground-truth masks for validation. For this purpose I additionally compute a "raw loss" on both the training and validation sets, without multiplying the RE loss by alpha (otherwise the loss metric would be influenced by the number of epochs, because alpha is scheduled over training).
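For what it's worth, the training logs posted in the issue above are consistent with the total loss being class_loss + p_class_loss + alpha * re_loss (+ conf_loss). A sketch of the "raw" variant described here (function and argument names are mine, not the repo's):

```python
# Sketch, assuming the loss decomposition above: the "raw" variant drops the
# alpha weighting so losses from different points of the alpha schedule
# remain comparable across train and validation.
def total_loss(class_loss, p_class_loss, re_loss, alpha, conf_loss=0.0):
    return class_loss + p_class_loss + alpha * re_loss + conf_loss


def raw_loss(class_loss, p_class_loss, re_loss, conf_loss=0.0):
    return class_loss + p_class_loss + re_loss + conf_loss
```

Plugging in the log line at iteration 132 (class 0.2831, p_class 0.2832, re 0.0733, alpha 0.08) reproduces the logged total of roughly 0.5721.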

Weights?

Hello! Thank you for this wonderful paper and code. Are you going to release the trained weights?

Evaluation in classifier training is using supervised segmentation maps?

Hello, thank you for the great repository! It's impressive how well organized it is.

I have a criticism (or perhaps a question, in case I misunderstood) regarding the training of the classifier:
I understand the importance of measuring and logging the mIoU during training (especially for the ablation section of the paper), but it does not seem correct to save the model with the best mIoU. That decision is based on fully supervised segmentation information, which should not be available in a truly weakly supervised setting, and it yields a model better suited for segmentation.
The paper doesn't address this. Am I right to assume all models were trained this way? Were there any trainings where other metrics were used for model selection (e.g. classification loss or Eq. (7) in the paper)?
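The alternative the commenter suggests could look like the following purely illustrative sketch: select checkpoints by validation classification loss, which requires no segmentation masks (`save_model` is a hypothetical hook, not a repo function):

```python
# Illustrative checkpoint selection without segmentation ground truth:
# keep the checkpoint with the lowest validation classification loss.
def select_best(losses):
    best_iter, best_loss = -1, float("inf")
    for i, loss in enumerate(losses):
        if loss < best_loss:
            best_iter, best_loss = i, loss
            # save_model(...)  # hypothetical hook
    return best_iter, best_loss


print(select_best([0.61, 0.48, 0.52, 0.44]))  # → (3, 0.44)
```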

Regarding fg_threshold and bg_threshold in make_affinity_labels

Hello,
I would like to know how you calculated the fg_threshold and bg_threshold in the code of make_affinity_labels. Correct me if I am wrong: after running inference_classification.py we are given a command to run evaluate.py, which reports a threshold value, and I believe that threshold corresponds to bg_threshold. How did you obtain the fg_threshold value? An explanation of how these thresholds are calculated would be very helpful.
Thank you,
Avinash.
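For context, the usual role of these two thresholds (a sketch of the common pattern, not the repo's exact `make_affinity_labels`; the default values below are illustrative) is to split CAM scores into confident foreground, confident background, and an ignored band in between:

```python
import numpy as np


def make_affinity_label(cam_max, fg_threshold=0.40, bg_threshold=0.10):
    """Turn per-pixel max CAM scores into affinity supervision (sketch).

    Pixels above fg_threshold: confident foreground (1).
    Pixels below bg_threshold: confident background (0).
    Everything in between: ignored (255).
    """
    label = np.full(cam_max.shape, 255, dtype=np.uint8)
    label[cam_max >= fg_threshold] = 1
    label[cam_max <= bg_threshold] = 0
    return label
```

Under this reading, a threshold reported by evaluate.py would fix one end of the band, and the other is chosen to leave a margin of uncertain pixels unlabeled.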

About the training procedure

Can you give me details of how the threshold values are chosen? I think there is a problem with this procedure, or I have missed something.
Thanks

Error occurs when the image size isn't 512 × n

Dear author,
I noticed that if the image size isn't 512 × 512, an error occurs. With an image size of 1280 × 496 I get a tensor-size mismatch in the puzzle module: the original feature map has 31 rows while the re-tiled feature has 32. After changing the image size to 1280 × 512, it works.
So I think this may be a small bug; it would be good to fix it or add a note in the code.
Thanks for your work!
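A possible workaround (an assumption, not a repo fix): pad the input so that both dimensions are multiples of the network stride times the 2 × 2 puzzle tiling, so the tiled feature maps merge back without a size mismatch. A minimal sketch:

```python
import numpy as np


def pad_to_multiple(image, multiple=32):
    """Zero-pad H and W up to the next multiple (sketch; 32 assumes a
    stride-16 backbone tiled 2 x 2, as the issue's numbers suggest)."""
    h, w = image.shape[:2]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return np.pad(image, ((0, pad_h), (0, pad_w)) + ((0, 0),) * (image.ndim - 2))
```

With the issue's numbers, a 1280 × 496 image is padded to 1280 × 512, which matches the size the reporter found to work.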

Ask for details of the training process!

I am trying to train with ResNeSt-101, and I also added AffinityNet and random-walk (RW) refinement.
Training runs with the specified code, but the resulting affinity labels are not effective and the pseudo-labels are almost entirely black. I don't know where the problem is; could someone explain the details? Help!

Performance issue

When I used the released weights for inference and evaluation, the mIoU I obtained differed from the mIoU reported in the paper. Do these weights correspond to the paper? If so, how can I reproduce the paper's results? Looking forward to your reply.

[Attached screenshots: PuzzleCAM, PuzzleCAM2]

[Question] conf loss not mentioned in the paper?

In your code there is a conf loss (ShannonEntropyLoss) applied to the tiled logits. It is not mentioned in your paper; is there a reason for that? What is the idea behind it?

Thanks again, I love your work/paper!
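For readers wondering what such a loss does: a Shannon-entropy penalty on predicted class probabilities is minimized when predictions are confident, so applying it to the tiled logits would push each tile toward a confident prediction. A generic sketch (the repo's implementation may differ, e.g. it may use sigmoid rather than softmax for multi-label outputs):

```python
import numpy as np


def shannon_entropy_loss(logits):
    """Mean Shannon entropy of softmax probabilities (sketch).

    Low entropy = confident predictions; minimizing this term encourages
    the (tiled) predictions to commit to a class.
    """
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    p = e / e.sum(axis=-1, keepdims=True)
    return float(-(p * np.log(p + 1e-8)).sum(axis=-1).mean())
```

Uniform logits over two classes give the maximum entropy ln 2 ≈ 0.693, while a confident prediction drives the loss toward 0.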

About training details

I ran train_classification.py with the ResNet-50 baseline and only reached a best train mIoU of 44.12%, which is lower than the 47.82% reported in the paper. I used four NVIDIA 1080 Ti GPUs. Could you share the experiment details? Thanks!
