
mislas's People

Contributors

zs-zhong


mislas's Issues

Have you tried 90-epoch training with mixup on ImageNet or iNaturalist?

Hi @zs-zhong,

Have you tried 90-epoch training with mixup on ImageNet or iNaturalist?

I have made some improvements based on your work, but due to limited computing resources, training a model for 180/200 epochs is too time-consuming for me, especially on iNaturalist.

In my reproduction on ImageNet-LT with ResNet-50, training Stage-1 for 90 epochs with mixup (alpha = 0.2) and Stage-2 for 10 epochs, the accuracies are as follows:

                        Stage-1      mixup   Stage-2     cRT    LWS
  Reported in Decouple  90 epochs    ✗       10 epochs   47.3   47.7
  My Reproduce          90 epochs    ✗       10 epochs   48.7   49.3
  My Reproduce          90 epochs    ✓       10 epochs   47.6   47.4
  My Reproduce          180 epochs   ✓       10 epochs   51.0   51.8
  Reported in MiSLAS    180 epochs   ✗       10 epochs   50.3   51.2
  Reported in MiSLAS    180 epochs   ✓       10 epochs   51.7   52.0

The 90-epoch mixup results (47.6 / 47.4) look much worse than those of the model trained for 180 epochs with mixup, and do not even improve on plain 90-epoch training without mixup (48.7 / 49.3).

I guess this is because mixup can be regarded as a regularization method that requires a longer training schedule; 90 epochs are not enough for the network to converge.
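For concreteness, here is a minimal sketch of the standard mixup training step (Zhang et al., ICLR 2018) in PyTorch; this is the common formulation rather than code from this repository, and model, x, y are placeholders:

    import numpy as np
    import torch
    import torch.nn.functional as F

    def mixup_step(model, x, y, alpha=0.2):
        # Sample the mixing coefficient from Beta(alpha, alpha).
        lam = float(np.random.beta(alpha, alpha))
        index = torch.randperm(x.size(0), device=x.device)
        # Convexly combine random pairs of inputs.
        mixed_x = lam * x + (1.0 - lam) * x[index]
        logits = model(mixed_x)
        # The loss is the matching convex combination of the two losses.
        return lam * F.cross_entropy(logits, y) \
            + (1.0 - lam) * F.cross_entropy(logits, y[index])

Since the network rarely sees clean, unmixed samples under this scheme, it plausibly needs a longer schedule to fit the training data.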

However, I cannot obtain the 90-epoch mixup result on iNaturalist, because the dataset is too large to fit in memory, so a single ResNet-50 training run takes me about a week.

If possible, could you please provide the ResNet-50 model pre-trained for 90 epochs with mixup on iNaturalist? I believe this would also benefit fair comparisons in future work.

Thank you again for your contribution; I look forward to your reply.

Question about the effect of shifted BN

Hi! Thanks for the great work. In issue #2, you mentioned that LWS fixes the affine part (alpha and beta in the paper, as far as I understand) and updates the running means and variances in Stage-2. From that I understand LWS also uses shifted BN; however, in Figure 4 there are differences in accuracy and ECE between mixup + LWS and mixup + LWS + shifted BN.

What causes the improvement in that experiment, and is there something wrong with my understanding?

Access to the models is restricted on Google Drive

Hello Zhisheng,

Access to the models is restricted by Google Drive (the message, originally in French, is translated below). Could you make the models accessible to everyone?

PS: I may have sent you access requests, sorry about that.

Robin

  Authorization is required
  You need to request owner access or sign in with an account that has the necessary permissions.

Shift Learning implementation

Hi, thanks for your work. However, your paper does not describe the implementation of shift learning in detail.

I guess that the BN parameters are re-trained in Stage-2, since the means and variances differ between the two stages. Is that true?

When will the code be released?

Hi! Thank you for such inspiring work! Do you have any plans to release your code? I'm looking forward to it.

Also, I have a small question about the method. In the paper you mention that applying mixup in Stage-2 yields no obvious improvement, but I cannot find a description of the overall pipeline. In your final framework, do you use mixup in Stage-1 only, or in both Stage-1 and Stage-2? Thanks again!

Regarding the test accuracy

I hope I am not mistaken: in the code, you seem to calculate the test accuracy every few training iterations and take the maximum over them.
My questions are:

  1. Are the results reported in the paper the maximum accuracy or the final accuracy after all iterations?
  2. Is the validation set the same as the test set for CIFAR?

Where is the implementation of updating 'the running mean and variance'?

Hello Dr. Zhong, thank you for your excellent work. I'm very interested in what you mentioned in Section 3.3:

  "we update the running mean μ and variance σ and yet fix the learnable linear transformation parameters α and β for better normalization in Stage-2."

However, I cannot find the corresponding implementation in your code. When you have a moment, could you point me to the exact location?
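For reference, here is a hypothetical PyTorch sketch of what such a Stage-2 setting could look like (fix BN's learnable affine parameters, i.e. weight = α and bias = β, while leaving the layers in train mode so the running statistics keep updating); this is my own illustration, not code from this repository:

    import torch.nn as nn

    def shift_bn_statistics_only(model: nn.Module):
        for m in model.modules():
            if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)) and m.affine:
                # Fix the learnable linear transformation (alpha, beta).
                m.weight.requires_grad_(False)
                m.bias.requires_grad_(False)
                # Train mode keeps re-estimating running_mean and
                # running_var from the Stage-2 (re-balanced) batches.
                m.train()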

Wish you good health and success in your studies!

About the BN part

Hi, thanks for your great work. I am wondering about the BN part: it seems that methods like cRT and DRW also update the running means and variances in Stage-2, right? I cannot find the code segment that freezes this part.
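To make the question concrete, freezing the running statistics would, as far as I know, look something like the following hypothetical PyTorch sketch (my own illustration, not code from this repository):

    import torch.nn as nn

    def freeze_bn_running_stats(model: nn.Module):
        # Call this after model.train() at the start of each epoch.
        # Eval mode stops updates to running_mean / running_var, while
        # the affine parameters can still be trained if their
        # requires_grad flags are left True.
        for m in model.modules():
            if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
                m.eval()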
