facebookresearch / msn
Masked Siamese Networks for Label-Efficient Learning (https://arxiv.org/abs/2204.07141)
License: Other
How to install the right cyanure version?
Could you add vit-small-8 and vit-base-4 config files?
I used the default vits-16 config to train on ImageNet-1k end to end, but I found that the loss converged to 2.492 after one epoch. Is that normal?
If so, how does performance improve, given that the loss no longer seems to decrease over the next several hundred epochs?
If not, is there anything I did wrong? The config I used is as follows:
criterion:
  ent_weight: 0.0
  final_sharpen: 0.25
  me_max: true
  memax_weight: 1.0
  num_proto: 1024
  start_sharpen: 0.25
  temperature: 0.1
  batch_size: 32
  use_ent: true
  use_sinkhorn: true
data:
  color_jitter_strength: 0.5
  pin_mem: true
  num_workers: 10
  image_folder: /gruntdata6/xinshulin/data/imagenet/new_train/1
  label_smoothing: 0.0
  patch_drop: 0.15
  rand_size: 224
  focal_size: 96
  rand_views: 1
  focal_views: 10
  root_path: /gruntdata6/xinshulin/data/imagenet/new_train
logging:
  folder: checkpoint/msn_os_logs4/
  write_tag: msn-experiment-1
meta:
  bottleneck: 1
  copy_data: false
  drop_path_rate: 0.0
  hidden_dim: 2048
  load_checkpoint: false
  model_name: deit_small
  output_dim: 256
  read_checkpoint: null
  use_bn: true
  use_fp16: false
  use_pred_head: false
optimization:
  clip_grad: 3.0
  epochs: 800
  final_lr: 1.0e-06
  final_weight_decay: 0.4
  lr: 0.001
  start_lr: 0.0002
  warmup: 15
  weight_decay: 0.04
Hi, amazing work. Any thoughts on releasing the code under a more permissive license?
The pretrained weights seem to be wrong.
For example, the vit_base has a dimension of 1024.
Could you upload the correct version? Thanks
I am confused: you use either a random mask or a focal mask, but why not try a block-wise mask, which has been proven effective?
Hi,
Thanks for contributing such awesome work! Could you please release the vit-b-16 config file?
Thanks,
Ziyu Jiang
Hello, the interfaces in the current main branch of cyanure are different from the ones you used (e.g. there is no MultiClassifier function).
I greatly enjoyed reading your paper but I'm curious about the segmentation performance of such models. Do MSNs share the same segmentation properties as DINO?
Hi, very interesting work! How was the lambda regularization parameter for logistic regression set for the results obtained in tables 1 and 2 in the paper? I get subpar performance values in some cases when I set it fixed to 0.075 or 0.0025 for all models (as mentioned in the repo and in another issue). Did you set it to a fixed value for all the models and subsets? If so, can you share this lambda value? Or did you do a sweep over a set of values and choose the best values? If so, can you share the set of lambdas that you did the search over? Thanks in advance!
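In case it helps frame the question, here is a minimal sketch of the kind of sweep being asked about. fit_and_score is a hypothetical helper (it would wrap the same cyanure fit call shown later in this thread plus a validation-accuracy evaluation), and the candidate values are illustrative, not the ones used in the paper:
# Hypothetical sweep over the l2-regularization strength for the logistic-regression probe.
# train_embs/train_labs/val_embs/val_labs are assumed to be pre-extracted features and labels,
# and fit_and_score is an assumed helper that trains with the given lambd and returns val accuracy.
candidate_lambdas = [0.1, 0.075, 0.05, 0.025, 0.01, 0.0075, 0.005, 0.0025, 0.001]
scores = {lambd: fit_and_score(train_embs, train_labs, val_embs, val_labs, lambd)
          for lambd in candidate_lambdas}
best_lambd = max(scores, key=scores.get)
print(f'best lambda: {best_lambd} (val acc: {scores[best_lambd]:.4f})')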
Hi, thank you for providing such awesome self-supervised learning research!
I'm wondering how much the performance will degrade when using a small batch size, e.g. between 128 and 512, for the DeiT-Base model.
If we cannot use a large batch size (e.g. 1024) with the base model, is it better to use a smaller model with a large batch size?
Thanks in advance!!
Hi, fantastic work! Thank you for your code!
There seems to be a small omission on line 327 of linear_eval.py.
linear_eval.py #L327
return encoder, opt, sched, epoch, best_acc
should be
return encoder, linear_classifier, opt, sched, epoch, best_acc
When running the main.py script on a local machine with 4 GPUs, I get the following error:
Duplicate GPU detected : rank 0 and rank 1 both on CUDA device 40
nvidia-smi shows 4 different GPUs, and the issue persists after reboot.
Hello, thanks for sharing your work.
I was a little confused about your 1% IN1k semi-supervised evaluation. You said in the paper that the results come from logistic regression on the extracted representations. However, with the same ViT, I found that this evaluation in iBOT comes from end-to-end full fine-tuning (see here), and SwAV et al. fine-tuned the entire ResNet-50 encoder.
Hello. Thank you for your great work!
I have some questions about the "AllReduce" class defined here.
Lines 226 to 241 in 4388dc1
It is used to gather the probs when computing the me-max regularization.
Lines 70 to 72 in 4388dc1
I wonder why you do not use "dist.all_reduce(x)" directly. It seems that using "AllReduce" multiplies the gradient by "world_size".
I want to know whether I am correct and why this makes sense.
Thanks!
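For readers without the repo open, here is a rough sketch of the kind of autograd wrapper being asked about (an illustration, not a verbatim copy of the class at the lines referenced above): the forward pass averages a tensor across ranks, while the backward pass returns the incoming gradient unchanged on each rank, whereas a bare dist.all_reduce call is a collective that is not itself part of the autograd graph.
import torch
import torch.distributed as dist

class AllReduceSketch(torch.autograd.Function):
    # Differentiable cross-rank averaging, sketched from the usage described above.

    @staticmethod
    def forward(ctx, x):
        if dist.is_available() and dist.is_initialized() and dist.get_world_size() > 1:
            x = x.contiguous() / dist.get_world_size()
            dist.all_reduce(x)  # after the division, x holds the mean over ranks
        return x

    @staticmethod
    def backward(ctx, grads):
        # Identity backward: each rank receives the incoming gradient unchanged.
        return grads

# In the me-max regularizer the usage would look roughly like:
#   avg_probs = AllReduceSketch.apply(torch.mean(probs, dim=0))
#   rloss = -torch.sum(torch.log(avg_probs ** (-avg_probs)))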
No such file or directory: '/checkpoint/msn_os_logs/params-msn-train.yaml'
No such file or directory: '/checkpoint/msn_os_logs/msn-experiment-1_r0.csv'
I couldn't find an official code release for the paper arxiv.org/abs/2210.07277, which proposes an extension to MSN that allows arbitrary feature priors.
It looks like the main difference is a change of a single term in the loss function. Is that correct? How would I implement the changes mentioned in the PMSN paper?
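Not an official answer, but a hedged reading of arXiv:2210.07277 is that the single changed term is the regularizer: MSN's mean-entropy-maximization term is replaced by a KL divergence between the mean prediction and a chosen prior (a power-law distribution over the prototypes in that paper's experiments). A minimal sketch, with power_law_prior as a hypothetical helper:
import torch

def power_law_prior(num_proto, tau):
    # Hypothetical helper: a power-law (Zipf-like) distribution over the prototypes,
    # parameterized by the exponent tau.
    ranks = torch.arange(1, num_proto + 1, dtype=torch.float)
    p = ranks.pow(-tau)
    return p / p.sum()

def msn_me_max_term(avg_probs):
    # MSN-style regularizer written as a loss: the negative entropy of the mean prediction.
    return torch.sum(avg_probs * torch.log(avg_probs))

def pmsn_prior_term(avg_probs, prior):
    # PMSN-style replacement: KL(avg_probs || prior). With a uniform prior this reduces
    # to the me-max term above up to an additive constant.
    return torch.sum(avg_probs * (torch.log(avg_probs) - torch.log(prior)))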
Hey thanks so much for making this code public. I'm trying to use this in a project and would love to be able to continue the training process. Could you please upload a checkpoint with all of the optimizer state? Thanks so much for your help!
Hey, after a few attempts, I would like to get some help :)
Thank you!
Hi! First of all, thank you for sharing such interesting research.
I have a question about this part of the source code.
# -- momentum schedule
_start_m, _final_m = 0.996, 1.0
_increment = (_final_m - _start_m) / (ipe * num_epochs * 1.25)
momentum_scheduler = (_start_m + (_increment*i) for i in range(int(ipe*num_epochs*1.25)+1))
# -- sharpening schedule
_increment_T = (_final_T - _start_T) / (ipe * num_epochs * 1.25)
sharpen_scheduler = (_start_T + (_increment_T*i) for i in range(int(ipe*num_epochs*1.25)+1))
Why do we need to multiply num_epochs by 1.25? The paper says "with a momentum value of 0.996, and linearly increase this value to 1.0 by the end of training", but in this case the momentum can only increase to 0.9992.
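A quick check of the arithmetic behind that observation, assuming the scheduler is stepped once per iteration and training stops after ipe * num_epochs iterations:
# With the 1.25 stretch, only the first 1 / 1.25 = 80% of the schedule is traversed
# by the end of training, so the final momentum value is:
start_m, final_m, stretch = 0.996, 1.0, 1.25
m_end = start_m + (final_m - start_m) / stretch
print(m_end)  # 0.9992, matching the value quoted above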
I only get 66.9% accuracy with the released ViT-S checkpoint, which is lower than the reported 67.2%. I use the provided default settings for logistic regression:
cyan.preprocess(embs, normalize=normalize, columns=False, centering=True)
classifier = cyan.MultiClassifier(loss='multiclass-logistic', penalty=penalty, fit_intercept=False)
classifier.fit(embs, labs, it0=10, lambd=lambd, lambd2=lambd, nthreads=-1, tol=1e-3, solver='auto', seed=0, max_epochs=300)
In addition, I set --blocks=1, --lambd=0.0025, --penalty=l2, --normalize=True.
Is there something wrong?
Hi, would you be interested in adding msn to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community. Models, datasets, and Spaces (web demos) can be added to a user account or organization, similar to GitHub.
Example from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook
Example spaces with repos:
github: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP
github: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore
Here are guides for adding Spaces/models/datasets to your org:
How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add models: https://huggingface.co/docs/hub/adding-a-model
Uploading a dataset: https://huggingface.co/docs/datasets/upload_dataset.html
Please let us know if you would be interested and if you have any questions, we can also help with the technical implementation.
@MidoAssran and team, great work!!!
I would like to use this with my custom dataset to build a linear classification model. I started pre-training on my custom dataset by modifying configs/pretrain/msn_vits16.yaml:
criterion:
  ent_weight: 0.0
  final_sharpen: 0.25
  me_max: true
  memax_weight: 1.0
  num_proto: 1024
  start_sharpen: 0.25
  temperature: 0.1
  batch_size: 64
  use_ent: true
  use_sinkhorn: true
data:
  color_jitter_strength: 0.5
  pin_mem: true
  num_workers: 10
  image_folder: custom_db/
  label_smoothing: 0.0
  patch_drop: 0.15
  rand_size: 224
  focal_size: 96
  rand_views: 1
  focal_views: 10
  root_path: dataset/
logging:
  folder: saved_models/msn_os_logs/
  write_tag: msn-experiment-1
meta:
  bottleneck: 1
  copy_data: false
  drop_path_rate: 0.0
  hidden_dim: 2048
  load_checkpoint: false
  model_name: deit_small
  output_dim: 256
  read_checkpoint: null
  use_bn: true
  use_fp16: false
  use_pred_head: false
optimization:
  clip_grad: 3.0
  epochs: 350
  final_lr: 1.0e-06
  final_weight_decay: 0.4
  lr: 0.001
  start_lr: 0.0002
  warmup: 15
  weight_decay: 0.04
and started the model pre-training using this command:
python main.py --fname configs/pretrain/msn_vits16.yaml --devices cuda:0
I was able to start the model training, as seen in the screenshots below.
Question: I am confused about how to start the downstream task of image classification.
Any help will be appreciated, thank you very much.
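Not an official answer, but here is a minimal sketch of one way to run the downstream linear-classification step on a frozen pre-trained encoder. It is not the repo's linear_eval.py or logistic_eval.py; the checkpoint filename, the 'target_encoder' state-dict key, the deit.deit_small factory, and the labeled class-subfolder dataset layout are all assumptions inferred from the config above.
import torch
import torch.nn.functional as F
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

import src.deit as deit  # model definitions shipped with this repo (config: model_name: deit_small)

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# 1. Load the frozen pre-trained encoder from the pre-training checkpoint (path and key are assumed).
ckpt = torch.load('saved_models/msn_os_logs/msn-experiment-1-latest.pth.tar', map_location='cpu')
encoder = deit.deit_small(patch_size=16).to(device).eval()
state = {k.replace('module.', ''): v for k, v in ckpt['target_encoder'].items()}
encoder.load_state_dict(state, strict=False)

# 2. Extract features for a labeled dataset (ImageFolder assumes one sub-directory per class).
transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(),
                       T.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))])
loader = DataLoader(ImageFolder('dataset/custom_db/', transform=transform),
                    batch_size=64, num_workers=10)
feats, labels = [], []
with torch.no_grad():
    for imgs, labs in loader:
        feats.append(encoder(imgs.to(device)).cpu())
        labels.append(labs)
feats, labels = torch.cat(feats), torch.cat(labels)

# 3. Fit a simple linear head on the frozen features (a stand-in for the cyanure-based
#    logistic regression used by the repo's evaluation scripts).
head = torch.nn.Linear(feats.shape[1], int(labels.max()) + 1)
opt = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
for epoch in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(head(feats), labels)
    loss.backward()
    opt.step()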