dvlab-research / mislas
Improving Calibration for Long-Tailed Recognition (CVPR2021)
License: MIT License
Hi @zs-zhong , I was wondering whether you used the class-balanced sampler in Stage-2?
Hi @zs-zhong ,
Have you tried 90 epochs training with mixup on ImageNet or iNaturalist ?
I have made some improvements based on your work, but due to limited computing resources, training a model for 180/200 epochs is too time-consuming for me, especially on iNaturalist.
In my reproduction, training Stage-1 for 90 epochs with mixup (alpha 0.2) on ImageNet-LT and Stage-2 for 10 epochs, the accuracies with ResNet-50 are as follows:
| Method | Stage-1 | mixup | Stage-2 | cRT | LWS |
|---|---|---|---|---|---|
| Reported in Decouple | 90 epochs | | 10 epochs | 47.3 | 47.7 |
| My reproduction | 90 epochs | | 10 epochs | 48.7 | 49.3 |
| My reproduction | 90 epochs | ✅ | 10 epochs | 47.6 | 47.4 |
| My reproduction | 180 epochs | | 10 epochs | 51.0 | 51.8 |
| Reported in MiSLAS | 180 epochs | | 10 epochs | 50.3 | 51.2 |
| Reported in MiSLAS | 180 epochs | ✅ | 10 epochs | 51.7 | 52.0 |
The 90-epoch mixup results look much worse than the model trained for 180 epochs with mixup, and do not even improve over training without mixup.
I guess this is because mixup can be regarded as a regularization method that requires longer training; 90 epochs are not enough for the network to converge.
However, I cannot obtain the result of training 90 epochs with mixup on iNaturalist, because the dataset is too large to fit in memory, so training ResNet-50 once takes me about a week.
If possible, could you please provide the pre-trained ResNet-50 model for training 90 epochs with mixup on iNaturalist? I believe this will also be beneficial for fair comparison of future work.
Thank you again for your contribution and look forward to your reply.
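For context, the mixup augmentation discussed in this thread (with alpha = 0.2, sampling λ ~ Beta(α, α)) can be sketched in a few lines of numpy. This is a generic illustration of the technique, not the repository's implementation:

```python
import numpy as np

def mixup(x, y_onehot, alpha=0.2, rng=None):
    """Mix each sample with a randomly chosen partner from the same batch."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)      # mixing coefficient in [0, 1]
    idx = rng.permutation(len(x))     # random partner for each sample
    x_mix = lam * x + (1 - lam) * x[idx]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[idx]
    return x_mix, y_mix, lam

# Toy batch: 4 samples, 3 features, 2 classes
x = np.arange(12, dtype=float).reshape(4, 3)
y = np.eye(2)[[0, 1, 0, 1]]
x_mix, y_mix, lam = mixup(x, y, alpha=0.2)
```

With a small alpha such as 0.2, the Beta distribution concentrates λ near 0 or 1, so most mixed samples stay close to one of the two originals; this mild regularization is consistent with the observation above that it needs longer schedules to pay off.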
Hi Zhisheng,
Thanks for your great work! Figure 1 in your paper is impressive, can you please provide the code for drawing this figure?
Hi! Thanks for the great work. In issue #2, you mentioned that LWS fixes the affine part (α and β in the paper, as far as I understand) and updates the running means and variances in Stage-2. By that reading, LWS already uses shifted BN; however, in Figure 4 there are differences in ACC and ECE between mixup+LWS and mixup+LWS+shifted BN.
What causes the improvement in that experiment? Is there anything wrong with my understanding?
Hi @zs-zhong ,
Thanks for your great work! Figure 2 in your paper is also impressive, could you please provide the code for drawing this figure?
Hello Zhisheng,
The access to the models is restricted by Google Drive (picture below, in French, translated below the picture). Could you make the models accessible to everyone?
PS: I may have sent you access requests, sorry about that.
Robin
Authorization is required
You need to request owner access or sign in with an account that has the necessary permissions. Find out more
Hello:
In the paper, I think you mean that nll_loss is only for the ground-truth label and smooth_loss is for the remaining K-1 labels.
But in the code
https://github.com/Jia-Research-Lab/MiSLAS/blob/e8f91e59a910c5543ea1bcabb955ba368c606a00/methods.py#L62
I think you still include the gt label in the smooth_loss.
I am confused about this.
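If it helps, here is a minimal numpy sketch of this loss formulation (an illustration, not the repository's code). Including the ground-truth label in smooth_loss is in fact the standard label-smoothing formulation: the gt class ends up with total weight (1 − ε) + ε/K and every other class with ε/K, which is exactly cross-entropy against targets smoothed toward the uniform distribution.

```python
import numpy as np

def label_smoothing_loss(logits, target, eps=0.1):
    # Numerically stable log-softmax
    z = logits - logits.max(axis=1, keepdims=True)
    logprob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    n = len(target)
    nll_loss = -logprob[np.arange(n), target]   # ground-truth term only
    smooth_loss = -logprob.mean(axis=1)         # mean over ALL K classes, gt included
    return ((1 - eps) * nll_loss + eps * smooth_loss).mean()
```

Expanding the smooth term: (1 − ε)·nll + ε·(1/K)·Σ_k(−log p_k) gives the gt class coefficient (1 − ε) + ε/K, matching the usual smoothed-target definition.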
Hi, thanks for your work. However, the paper does not describe the implementation of the shifted learning in detail.
I guess that the BN statistics are re-estimated in Stage-2, since the means and variances differ between the two stages. Is that right?
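A minimal numpy sketch of what that guess would look like (my reading of "update the running mean/variance, fix α and β", not the authors' actual code): run BN in training mode so the running statistics get re-estimated, while treating the affine parameters α (scale) and β (shift) as frozen constants.

```python
import numpy as np

def bn_stage2_step(x, running_mean, running_var, alpha, beta,
                   momentum=0.1, eps=1e-5):
    """One training-mode BN step where only the running statistics move."""
    mu, var = x.mean(axis=0), x.var(axis=0)            # batch statistics
    # Re-estimate running statistics (the Stage-2 "shift")
    running_mean = (1 - momentum) * running_mean + momentum * mu
    running_var = (1 - momentum) * running_var + momentum * var
    # alpha and beta are frozen: used but never updated
    y = alpha * (x - mu) / np.sqrt(var + eps) + beta
    return y, running_mean, running_var

rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=(64, 8))      # features shifted away from 0
rm, rv = np.zeros(8), np.ones(8)            # stale Stage-1 running stats
alpha, beta = np.full(8, 2.0), np.full(8, 0.5)
y, rm2, rv2 = bn_stage2_step(x, rm, rv, alpha, beta)
```

At inference you would then normalize with the updated `rm2`/`rv2` instead of the Stage-1 statistics; the affine parameters are unchanged throughout.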
Hi! Thank you for such an inspiring work! Do you have any plan of releasing your code? I'm looking forward to that.
Plus, I have a small question regarding the method. In the paper you mentioned that when applying mixup in stage 2 yields no obvious improvement, but I cannot find a description of your overall method and I'd like to know in your final framework whether you use mixup in stage 1 only or in both stage 1&2. Thanks again!
I hope I am not wrong, but in the code I see that you compute the test accuracy every few training iterations and report the maximum over them.
My question was
Hello Dr. Zhong, thank you for your excellent work. I'm very interested in what you mentioned in Section 3.3:
> we update the running mean μ and variance σ and yet fix the learnable linear transformation parameters α and β for better normalization in Stage-2.
But, I cannot find the implementation in your code. If you are available, can you tell me the exact location?
Wish you good health and success in your studies!
Hi, thanks for your great work. I am wondering about the BN part: it seems that methods like "cRT" and "DRW" do update the running means and variances, right? I cannot find the code segment that freezes this part.