
Comments (6)

zhangchbin commented on May 23, 2024

Hi, @Kurumi233
Our paper is under review, so the code will be made public later. By the way, we do not use synchronized batch normalization.
Our code borrows heavily from CutMix-PyTorch, so you can refer to that repo. A few points are worth knowing. First, we accumulate only the correctly predicted samples, and we use only the soft labels accumulated during the previous epoch as the ground-truth soft labels. All our experiments run on 4 RTX 2080Ti GPUs with a batch size of 64x4, weight decay of 1e-4, and SGD with momentum. The initial learning rate is 0.1, and it decays at epochs 75, 150, and 225. There may be some differences between your re-implementation and ours. Please feel free to contact me by e-mail: zhangchbin Dot gmail.com (we can also chat on WeChat).
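The accumulation scheme described here (add only the softmax outputs of correctly predicted samples, then use the per-class averages from the previous epoch as soft targets) could be sketched roughly as below. This is a minimal NumPy illustration of the idea only, not the authors' code; the class and method names are made up for this example.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class OnlineLabelSmoother:
    """Accumulates softmax outputs of correctly predicted samples during
    one epoch; the per-class averages become the soft targets for the
    next epoch. (Illustrative sketch, not the authors' implementation.)"""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        # Targets for the first epoch start as hard one-hot labels.
        self.targets = np.eye(num_classes)
        # Running sums and counts for the epoch in progress.
        self._sums = np.zeros((num_classes, num_classes))
        self._counts = np.zeros(num_classes)

    def accumulate(self, logits, labels):
        """Add softmax outputs of correctly predicted samples only."""
        probs = softmax(logits)
        correct = probs.argmax(axis=-1) == labels
        for p, y in zip(probs[correct], labels[correct]):
            self._sums[y] += p
            self._counts[y] += 1

    def end_epoch(self):
        """Average the accumulated distributions per class; a class with
        no correct predictions keeps its previous soft label."""
        for c in range(self.num_classes):
            if self._counts[c] > 0:
                self.targets[c] = self._sums[c] / self._counts[c]
        self._sums[:] = 0.0
        self._counts[:] = 0.0
```

In training, `accumulate` would be called on each batch and `end_epoch` once per epoch, with `targets[label]` used as the soft target for the cross-entropy in the following epoch.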

from onlinelabelsmoothing.

Kurumi233 commented on May 23, 2024


OK, thanks. I will check it.


zhangchbin commented on May 23, 2024


I am glad to help you re-implement it.


Kurumi233 commented on May 23, 2024

Hello, I now get 77.67% top-1 accuracy on ImageNet with ResNet-50. I changed the weight decay from 5e-4 to 1e-4 and the batch size to 256. I had assumed the batch size did not need to be multiplied by the number of GPUs when training without sync-BN. The result seems to be within the margin of error. The code has been released in my repository.

But I still have one concern. When I first said "synchronization", I did not mean sync-BN; I meant that the soft labels also need to be synchronized when training on multiple GPUs. I use a single 32GB GPU, which avoids this problem, so I hope you will handle it in your code. Thank you.
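For illustration, the synchronization being requested could amount to summing each process's per-class probability sums and correct-prediction counts (e.g. via an all-reduce in a DDP setup) before normalizing. The sketch below simulates that merge on a single process with NumPy; the function name and array shapes are assumptions for this example, not the repository's API.

```python
import numpy as np

def merge_soft_label_stats(per_rank_sums, per_rank_counts):
    """Simulate an all-reduce across ranks: sum each rank's per-class
    probability sums (shape (R, C, C)) and correct-prediction counts
    (shape (R, C)), then normalize once globally. Classes with no
    correct predictions anywhere fall back to a one-hot label."""
    total_sums = np.sum(per_rank_sums, axis=0)      # (C, C)
    total_counts = np.sum(per_rank_counts, axis=0)  # (C,)
    targets = np.eye(total_sums.shape[0])           # fallback: one-hot
    seen = total_counts > 0
    targets[seen] = total_sums[seen] / total_counts[seen][:, None]
    return targets
```

In an actual multi-GPU run, the two `np.sum` calls would be replaced by a collective operation such as `torch.distributed.all_reduce` on the sum and count tensors, so every rank normalizes identical statistics.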


zhangchbin commented on May 23, 2024

Hi, @Kurumi233
Thanks for your effort and for the reminder. We first built the code with DataParallel in PyTorch and obtained the initial ResNet-50 performance reported in our paper; we then rebuilt it with DDP in NVIDIA Apex. Thanks again for the kind reminder: we do not synchronize the soft labels in our DDP version, and we will fix this bug. Note that we have not tried other schedules, such as 100 epochs with decays at 30, 60, and 90, because we followed CutMix. By the way, our method also improves performance when combined with data augmentation methods such as Cutout, Mixup, and CutMix; in particular, "Cutout + OLS" reaches the same accuracy as "CutMix" alone.
And we will link your repo in our README.


Kurumi233 commented on May 23, 2024

@zhangchbin
I am glad you accepted my suggestion.
I believe simple is best, so I find your work very interesting and am happy to help.

