The fast-moco from orashi

fast-moco's Issues

unable to reproduce imagenet linear classification

Hi, thanks for your amazing work and excellently written codebase.

TL;DR: We trained the linear head on ImageNet twice and obtained accuracies of 73.08% (attached log file: baseline_log.txt) and 73.11%. The main paper, however, reports 73.5%, which is about 0.4 percentage points higher, a significant gap on the scale of the ImageNet validation set.
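To put the gap in perspective, here is a minimal sketch of the arithmetic. It assumes the standard ImageNet-1k validation set of 50,000 images; the variable names are purely illustrative, and the binomial standard error is only a rough yardstick for evaluation noise, not a measure of run-to-run training variance.

```python
import math

# Top-1 accuracies quoted in this issue, in percent.
our_runs = [73.08, 73.11]
reported = 73.5

# Assumption: the standard ImageNet-1k validation set (50,000 images).
VAL_IMAGES = 50_000

mean_ours = sum(our_runs) / len(our_runs)
gap = reported - mean_ours            # gap in percentage points
gap_images = gap / 100 * VAL_IMAGES   # roughly how many images that corresponds to

# Binomial standard error of a single accuracy measurement at the reported level.
p = reported / 100
std_err = 100 * math.sqrt(p * (1 - p) / VAL_IMAGES)

print(f"mean of our runs: {mean_ours:.3f}%")
print(f"gap to the paper: {gap:.3f} points (~{gap_images:.0f} images)")
print(f"evaluation standard error: {std_err:.3f} points")
```

Under these assumptions the gap is roughly twice the evaluation standard error, while our two runs differ from each other by only 0.03 points.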

At the same time, the self-supervised model checkpoint shared on GitHub here (https://drive.google.com/file/d/12ZEKiUg8ep2LgX5cJbEELmPHJRcFEMrR/view) does reproduce the reported result.

So we suspect one of three possibilities behind the mismatch:
1) there is an issue in the way we train our self-supervised model, or
2) there is an issue in the way we train the linear head, or
3) self-supervised training itself varies by about 0.5% from run to run. Did you observe such variance when simply repeating self-supervised training?

The second possibility, an issue in training the linear classifier, looks unlikely: with the exact same settings we reach 73.51% when we start from the checkpoint provided on GitHub (corresponding to 100 epochs). We also provide the log file for reference: baseline_github_log.txt

Another possible reason is a difference in the way we train our self-supervised model. We share our training logs for your reference: log_self_sup.txt

Kindly compare our logs with your self-supervised training logs (the ones used for the results reported in the paper) and let us know if there is any major difference.

Looking forward to a positive reply asap, and, once again thanks for your amazing work :)

baseline_github_log.txt
log_self_sup.txt
baseline_log.txt

resnet-18

Hi there,
I am just wondering whether you have tested fast-moco using lighter networks such as resnet-18 or efficientnet.

Did you try this on mocov2?

Details of training time

Hi, @orashi. Thanks for your outstanding work! I really appreciate this public code.

Could you please also release the training time for the model and the details of related devices? Thank you.

question about adamw optimizer

Hi,

I noticed in the Appendix that you also tried pretraining with the AdamW optimizer. It achieves good results with only 100 epochs. What are the training hyper-parameters for AdamW (lr, wd, batch size, and so on)?

Use additional global view based on combined view

Thanks for your great work. It is a very interesting perspective for SSL. I just have one question: have you ever tried adding an additional global view to generate "p"? I mean, besides the 6 local views, also generate a global view in the conventional way, which leads to 7 views in total. Would the performance improve? I am very curious about this and would appreciate it if you could share your opinion. Thanks a lot for your help!
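To make the view count concrete, here is a small sketch of the bookkeeping. It assumes the 6 local views come from splitting an image into 4 patches (a 2x2 grid) and combining them 2 at a time; the patch names and the "global" placeholder are hypothetical and not taken from the codebase.

```python
from itertools import combinations

# Assumption: each image is split into a 2x2 grid, giving 4 patches.
patches = ["p0", "p1", "p2", "p3"]

# Assumption: a combined (local) view is built from 2 of the 4 patches.
combined_views = list(combinations(patches, 2))

# The proposal in this issue: add one conventional global view on top.
views = combined_views + [("global",)]

print(len(combined_views), "combined views:", combined_views)
print(len(views), "views in total")
```

With these assumptions there are C(4, 2) = 6 combined views, and adding the global view brings the total to 7, matching the count in the question.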

mc=0.0

Hello, is mc=0.0 an internal package? Can you share it?
