
sthalles / simclr


PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Home Page: https://sthalles.github.io/simple-self-supervised-learning/

License: MIT License

Languages: Python 28.98%, Jupyter Notebook 71.02%
Topics: machine-learning, deep-learning, representation-learning, pytorch-implementation, pytorch, torchvision, unsupervised-learning, contrastive-loss, simclr

simclr's Introduction

PyTorch SimCLR: A Simple Framework for Contrastive Learning of Visual Representations


Image of SimCLR Arch

Installation

$ conda env create --name simclr --file env.yml
$ conda activate simclr
$ python run.py

Config file

Before running SimCLR, make sure you choose the correct running configuration. You can change it by passing command-line arguments to run.py.

$ python run.py -data ./datasets --dataset-name stl10 --log-every-n-steps 100 --epochs 100 

If you want to run it on CPU (for debugging purposes) use the --disable-cuda option.

For 16-bit precision GPU training, there is no need to install NVIDIA Apex. Just pass the --fp16_precision flag and this implementation will use PyTorch's built-in AMP training.
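For reference, a minimal sketch of the AMP pattern this flag turns on (a dummy model and optimizer stand in for the repo's ResNet and InfoNCE setup; requires a CUDA device):

    import torch
    import torch.nn as nn

    # Dummy stand-ins so the sketch runs; the repo wires up a ResNet + InfoNCE loss instead.
    model = nn.Linear(128, 10).cuda()
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    scaler = torch.cuda.amp.GradScaler()

    for _ in range(3):                                   # a few fake training steps
        images = torch.randn(32, 128, device="cuda")
        labels = torch.randint(0, 10, (32,), device="cuda")
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():                  # forward pass in mixed precision
            loss = criterion(model(images), labels)
        scaler.scale(loss).backward()                    # scale loss to avoid fp16 underflow
        scaler.step(optimizer)                           # unscale grads, then optimizer step
        scaler.update()                                  # adapt the loss scale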

Feature Evaluation

Feature evaluation is done using a linear model protocol.

First, we learn features using SimCLR on the STL10 unsupervised set. Then, we train a linear classifier on top of the frozen features from SimCLR. The linear model is trained on features extracted from the STL10 train set and evaluated on the STL10 test set.
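For reference, a minimal sketch of the linear protocol (an illustration, not the notebook's exact code: a fresh torchvision ResNet-18 stands in for the loaded SimCLR checkpoint, and fake tensors stand in for STL10 batches):

    import torch
    import torch.nn as nn
    from torchvision import models

    backbone = models.resnet18(num_classes=512)     # stand-in for the SimCLR-pretrained encoder
    for p in backbone.parameters():
        p.requires_grad = False                     # freeze all features

    head = nn.Linear(512, 10)                       # the only trainable part
    optimizer = torch.optim.Adam(head.parameters(), lr=3e-4)
    criterion = nn.CrossEntropyLoss()

    images = torch.randn(8, 3, 96, 96)              # fake STL10-sized batch
    targets = torch.randint(0, 10, (8,))
    with torch.no_grad():
        feats = backbone(images)                    # frozen 512-d features
    optimizer.zero_grad()
    loss = criterion(head(feats), targets)
    loss.backward()
    optimizer.step()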

Check the Open In Colab notebook for reproducibility.

Note that SimCLR benefits from longer training.

| Linear Classification | Dataset | Feature Extractor | Architecture | Feature dimensionality | Projection Head dimensionality | Epochs | Top1 % |
|---|---|---|---|---|---|---|---|
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 74.45 |
| Logistic Regression (Adam) | CIFAR10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 69.82 |
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-50 | 2048 | 128 | 50 | 70.075 |

simclr's People

Contributors

alessiamarcolini, butyuhao, sthalles


simclr's Issues

Is there evaluation code that runs locally?

Dear researcher,
Thank you for the open-source code you provided; it has been a great help to me in understanding SimCLR.
Your code is perfect, but I want to ask whether there is evaluation code that runs locally, without Google Colab, or how I can amend the code so that the evaluation runs locally, because I can't access Google Colab in China. I hope you can give me some tips if you are free.
Thanks!
Chen He.

Issue with batch-size

In the function info_nce_loss, line 28 creates labels based on batch_size. On the other side, the STL10 dataset has 100,000 images, which is divisible by a batch_size of 32, but a batch_size of 128 or 64 leaves a remainder of 32.

With batch_size != 32, line 42 raises an error, because the similarity matrix is based on the features while the labels are based on batch_size.

For instance, if batch_size = 128, the last iteration of the data_loader holds the remaining 32 images. Since we create two views of each image, we have 64 feature vectors. Line 28 still creates 128 x 2 = 256 labels, while the similarity matrix is (64 x 128) @ (128 x 64) => (64 x 64), which together with the (256 x 256) mask causes a "dimension mismatch".

Solution:
Change line 28 as follows:

labels = torch.cat([torch.arange(features.shape[0]//2) for i in range(self.args.n_views)], dim=0)
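An alternative workaround (an assumption, not the repository's confirmed fix) is to drop the ragged final batch in the loader, so that every batch really contains batch_size images:

    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=args.batch_size,
        shuffle=True, drop_last=True)   # drop_last avoids the 32-image remainder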


I can't get past this error in simclr.py; how do I solve it? I'm going crazy.

This is the error:
Files already downloaded and verified
0%| | 0/390 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "run.py", line 90, in <module>
    main()
  File "run.py", line 86, in main
    simclr.train(train_loader)
  File "D:\SimCLR-master\simclr.py", line 71, in train
    for images, _ in tqdm(train_loader):
  File "D:\miniconda3\envs\SimCLR-Han\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "D:\miniconda3\envs\SimCLR-Han\lib\site-packages\torch\utils\data\dataloader.py", line 352, in __iter__
    return self._get_iterator()
  File "D:\miniconda3\envs\SimCLR-Han\lib\site-packages\torch\utils\data\dataloader.py", line 294, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "D:\miniconda3\envs\SimCLR-Han\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __init__
    w.start()
  File "D:\miniconda3\envs\SimCLR-Han\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "D:\miniconda3\envs\SimCLR-Han\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "D:\miniconda3\envs\SimCLR-Han\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "D:\miniconda3\envs\SimCLR-Han\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "D:\miniconda3\envs\SimCLR-Han\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
OSError: [Errno 22] Invalid argument

About loss function

Hi,

Thank you for the great work. I am trying to use your code on 3D patches. I separately input two paired datasets, which contain a domain difference, and didn't use the data augmentation. I have extracted the representations using an encoder. However, the contrastive loss comes out as zero. Are there any steps I haven't done to run the code successfully?

Thanks a lot!

Loss function

Hi.
Thanks for your great work, but I have a little confusion. You implement the contrastive loss with Cross-Entropy Loss without a softmax function. So it seems the negatives don't actually contribute, only the positives.

Conda requirements broken

Running the command

$ conda create --name simclr python=3.7 --file requirements.txt

gives the following error:

Collecting package metadata (current_repodata.json): done
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  - pyasn1==0.4.8=pypi_0
  - grpcio==1.27.2=pypi_0
  - google-auth==1.11.3=pypi_0
  - idna==2.9=pypi_0
  - google-auth-oauthlib==0.4.1=pypi_0
  - tensorboard==2.1.1=pypi_0
  - requests-oauthlib==1.3.0=pypi_0
  - requests==2.23.0=pypi_0
  - markdown==3.2.1=pypi_0
  - pyyaml==5.3=pypi_0
  - cachetools==4.0.0=pypi_0
  - werkzeug==1.0.0=pypi_0
  - absl-py==0.9.0=pypi_0
  - pytorch==1.4.0=py3.7_cuda10.1.243_cudnn7.6.3_0
  - oauthlib==3.1.0=pypi_0
  - urllib3==1.25.8=pypi_0
  - pyasn1-modules==0.2.8=pypi_0
  - protobuf==3.11.3=pypi_0
  - chardet==3.0.4=pypi_0
  - rsa==4.0=pypi_0
  - torchvision==0.5.0=py37_cu101

Current channels:

  - https://conda.anaconda.org/conda-forge/linux-64
  - https://conda.anaconda.org/conda-forge/noarch
  - https://repo.anaconda.com/pkgs/main/linux-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/linux-64
  - https://repo.anaconda.com/pkgs/r/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

    https://anaconda.org

and use the search bar at the top of the page.

GPU utilization rate is low

Hi, thanks for the code!

When I tried to run it on a single GPU (V100), the utilization rate is very low (~0-10%) even if I increase num_workers. Would you know why this happens and how to solve it? Thanks!

Info NCE loss

Hi, may I ask how you calculate the InfoNCE loss in this work? I am confused about the methodology, as it is quite different from the authors' code.

You are returning labels of all 0, as if you only wanted to score the negative labels. However, in the code here you use the logits of both the negative samples and the positive sample (I'm assuming this is the augmented counterpart of the image). May I ask the reasoning behind this kind of implementation?

SimCLR/simclr.py

Lines 51 to 55 in 1848fc9

logits = torch.cat([positives, negatives], dim=1)
labels = torch.zeros(logits.shape[0], dtype=torch.long).to(self.args.device)
logits = logits / self.args.temperature
return logits, labels

P.S.: I am currently still at a loss about how you were able to simplify the code to calculating only the negative samples. Hopefully this can be clarified in your reply. Thank you!
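For context: with the logits ordered as [positive, negatives] and a target of 0, cross-entropy computes -log(exp(pos/t) / sum_k exp(logit_k/t)), which is exactly the InfoNCE objective, so the negatives do contribute, through the softmax denominator. A minimal numeric check of that equivalence (illustrative values only):

    import torch
    import torch.nn.functional as F

    # One row of logits: first entry is the positive pair, the rest are negatives.
    logits = torch.tensor([[0.9, 0.1, -0.3, 0.2]]) / 0.07    # temperature 0.07
    target = torch.zeros(1, dtype=torch.long)                # index of the positive

    ce = F.cross_entropy(logits, target)
    manual = -torch.log(torch.softmax(logits, dim=1)[0, 0])  # -log p(positive)
    print(torch.allclose(ce, manual))                        # True: negatives act via the denominator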

NT_Xent Loss function: all negatives are not being used?

Hi @sthalles , Thank you for sharing your code!

Please correct me if I am wrong:
I see that on line 57 of loss/nt_xent.py (below) you are not computing the contrastive loss for all negative pairs, since you reshape the total negatives into a 2D array, i.e. only a part of the negative pairs is used for a single positive pair, right?

    negatives = similarity_matrix[self.mask_samples_from_same_repr].view(2 * self.batch_size, -1)
    logits = torch.cat((positives, negatives), dim=1)
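For context, a shape walkthrough (an interpretation, assuming batch_size = N, two views per image, and a stand-in for mask_samples_from_same_repr): the similarity matrix is (2N, 2N); the mask removes the diagonal and the one positive entry per row, leaving 2N * (2N - 2) values, so the view yields (2N, 2N - 2) and every row still holds all 2N - 2 negatives for that sample:

    import torch

    N = 4                                      # batch size
    sim = torch.randn(2 * N, 2 * N)            # similarity matrix over 2N views

    # Stand-in for mask_samples_from_same_repr: drop self-similarity (diagonal)
    # and the positive pair (i, i+N) for each row.
    diag = torch.eye(2 * N, dtype=torch.bool)
    pos = torch.roll(diag, shifts=N, dims=1)
    mask = ~(diag | pos)

    negatives = sim[mask].view(2 * N, -1)
    print(negatives.shape)                     # torch.Size([8, 6]): all 2N-2 negatives per row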

Hope to hear from you soon.

-Ishan

Confusion matrix

Does anyone know how to add a confusion matrix to this code? I added one following an online example, but something went wrong and I can't tell what's wrong in my code. I can't solve it, please help me! Thanks.

    def confusion_matrix(output, labels, conf_matrix):
        # Accumulate predicted-vs-true counts into a running matrix.
        preds = torch.argmax(output, dim=-1)
        for p, t in zip(preds, labels):
            conf_matrix[p, t] += 1
        return conf_matrix

Run with error

Hi,

Thank you for the great work. Following the README, I ran into a problem while running the code.
[Image: error screenshot]

Should assert n_views == 2?

Thanks for your excellent implementation! I'd like to confirm that N_VIEW == 2, as in the paper and the default args in the code. If N_VIEW > 2, then with logits.shape = (N_VIEW x N, N_VIEW x N - 1) after

logits = torch.cat([positives, negatives], dim=1)

the N_VIEW x N - 1 columns contain at least one more positive pair (besides the one at index 0), which will be treated as negative pairs. @sthalles @alessiamarcolini @butyuhao
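A minimal guard along these lines (a sketch, assuming the args.n_views argument from run.py):

    assert args.n_views == 2, (
        "info_nce_loss labels only index 0 as positive; with n_views > 2 the "
        "extra positive views would be scored as negatives.")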

training log

Hi,
Do you still have the pre-training log? I want to know how the loss changes at every epoch, and the accuracy on the positive examples in each batch.

Calculating acc in training

Hi, I have a question about how accuracy is calculated in the training mode of the SimCLR model. How does it work? How is it possible to compute accuracy when you are training on data without labels? Thanks a lot!

ModuleNotFoundError: No module named 'torch.cuda'

I am using Python 3.7 on Win10, Anaconda Jupyter. I have successfully installed torch-1.10.0+cu113, torchaudio-0.10.0+cu113, and torchvision-0.11.1+cu113.
When trying to import torch, I get ModuleNotFoundError: No module named 'torch.cuda'.
Detailed error:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-bfd2c657fa76> in <module>
      1 import numpy as np
      2 import pandas as pd
----> 3 import torch
      4 import torch.nn as nn
      5 from sklearn.model_selection import train_test_split

~\AppData\Roaming\Python\Python38\site-packages\torch\__init__.py in <module>
    603 
    604 # Shared memory manager needs to know the exact location of manager executable
--> 605 _C._initExtension(manager_path())
    606 del manager_path
    607 

ModuleNotFoundError: No module named 'torch.cuda'

I found posts about the similar error No module named 'torch.cuda.amp'. However, none of the suggested solutions worked. Please advise.

evaluation code batch_size & validation process

I really appreciate your good work :)
I'm leaving a question because I got confused while studying your code.

First, I wonder why you used "batch_size=batch_size*2" in the test_loader part of the file "mini_batch_logistic_regression_valuator.ipynb", differently from the train_loader. Is it related to creating 2 views when doing data augmentation?

Also, in the last cell of this file, I'm confused about whether the second "for" loop (of the two) inside the big epoch loop corresponds to the test process or the validation process. I thought it was a test process, because the loss update, backpropagation, optimization, etc. are done only in the first loop, and the second only yields accuracy; is that right? Or is the second loop a validation process, since the first and second loops run together over the whole epoch?

Why cos_sim after L2 norm?

Hi, This code is really useful for me. Thanks!
But I have a question about the NT-Xent loss. I noticed that you apply an L2 norm to z and then use cosine similarity after that. But cosine similarity already contains an L2 normalization. Why apply the L2 norm first?

Question about CE Loss

Hello,

Thanks for sharing the code, nice implementation.

The way you calculate the loss by using a mask is quite brilliant. But I have a question.

logits = torch.cat((positives, negatives), dim=1)
So if I'm not wrong, the first column of logits is positive and the rest are negatives.

labels = torch.zeros(2 * self.batch_size).to(self.device).long()
But your labels are all zeros, which would seem to mean that, positive or negative, the similarity should be low.

So I wonder whether the first column of labels is supposed to be 1 instead of 0.

Thanks for your help.

'CosineAnnealingLR' never works with the wrong position of 'scheduler.step()'

Considering the setting in scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=len(train_loader), eta_min=0, last_epoch=-1), I think scheduler.step() should be called at every step inside for (xis, xjs), _ in train_loader. Otherwise the lr will never change until len(train_loader) epochs, not steps, have passed.
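For reference, the two self-consistent pairings of T_max and step() placement (a sketch with dummy stand-ins; pick one option, the two are alternatives):

    import torch

    model = torch.nn.Linear(4, 4)                         # dummy model
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    epochs, steps_per_epoch = 5, 10                       # stand-ins for args.epochs, len(train_loader)

    # Option A: per-step annealing; T_max counts optimizer steps.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs * steps_per_epoch, eta_min=0)
    for epoch in range(epochs):
        for step in range(steps_per_epoch):
            optimizer.step()                              # forward/backward elided
            scheduler.step()                              # step every batch

    # Option B: per-epoch annealing; T_max counts epochs.
    # scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    #     optimizer, T_max=epochs, eta_min=0)
    # for epoch in range(epochs):
    #     ...train one epoch...
    #     scheduler.step()                                # step once per epoch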

Validation Loss calculation

First of all, thank you for your great work!

Method _validate in simclr.py will raise ZeroDivisionError at line 148 if the validation data loader performs only one iteration (since counter starts from 0).
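A minimal defensive fix, assuming the counter/valid_loss bookkeeping described above (names hypothetical, not the exact repo code):

    valid_loss /= max(counter, 1)  # hypothetical guard: safe even on a one-iteration loader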

Run experiments on ImageNet

Hi,
Thanks for your nice work.
I am planning to run SimCLR on the ImageNet dataset. I wonder if I need to adjust the network structure or add some tricks, for example increasing the dimensionality of the output used to calculate the loss, which in your code is 64. Or can I directly change the dataset to ImageNet and keep the rest of the configuration the same?
I'd appreciate any advice.

Permission Denied with the model download link

Hi sthalles,
Thanks for your great implementation. When I run your linear_feature_eval.ipynb, there is an error with the model download link:
Permission denied: https://drive.google.com/uc?id=1LjuZ1RmhotrnugprRQc2Exk0EbQHMJhL
Maybe you need to change the permission to 'Anyone with the link'?
unzip: cannot find or open Mar14_05-52-52_thallessilva, Mar14_05-52-52_thallessilva.zip or Mar14_05-52-52_thallessilva.ZIP.
Could you change the download permission for the link? Thanks a lot.

keyword arguments to the run.py file

For smooth execution of run.py, use the following command:

BEFORE : $ python run.py -data ./datasets --dataset-name stl10 --log-every-n-steps 100 --epochs 100
AFTER : $ python run.py -data ./datasets -dataset-name stl10 --log-every-n-steps 100 --epochs 100

Or else you can make a change in run.py at line 16, from parser.add_argument('-dataset-name', default='stl10', help='dataset name', choices=['stl10', 'cifar10']) to parser.add_argument('--dataset-name', default='stl10', help='dataset name', choices=['stl10', 'cifar10']).

No upscale in image augmentation?

The SimCLR paper says:

In this work, we sequentially apply three simple augmentations: random
cropping followed by resize back to the original size, random color distortions, and random Gaussian blur

but it seems like the augmentations used in this repository take a random crop without afterwards resizing the crop back to the original size. Why the difference? Am I misunderstanding the SimCLR paper?
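If the repository uses torchvision's RandomResizedCrop (an assumption worth checking against the augmentation code), the resize back is built in: the transform takes a crop of random area and aspect ratio and then resizes it to the requested output size, matching the paper's crop-then-resize step:

    from torchvision import transforms

    # Crops a random patch, then resizes it back to 96x96 in one transform.
    crop = transforms.RandomResizedCrop(size=96)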

A question about the "labels"

Hi! I have a question about the definition of "labels" in the script "simclr.py".

On line 54 of "simclr.py", the authors defined:

labels = torch.zeros(logits.shape[0], dtype=torch.long).to(self.args.device)

So all the entries of "labels" are zeros. But according to the paper, shouldn't there be an entry of 1 for the positive pair?

Thanks in advance for your reply!

Loading pretrained model weights

The code uses state_dict = torch.load for the pretrained model, but I was not able to get it to use pretrained weights for the ResNets. Any suggestions?

Loss function and optimizer

Hi Thalles,
I went through the code and found two things I can't understand:
(1) In the code labels = torch.zeros(2 * self.batch_size).to(self.device).long() in nt_xent.py, the label seems constant. So are the labels unused, since they are all 0?
(2) Is Adam + scheduler = torch.optim.lr_scheduler.CosineAnnealingLR the same as the LARS optimizer?

Thanks in advance.

Why is there no validation loss?

Hi, thanks very much for providing the code and framework!

Can I check why the code does not monitor the loss on a separate validation set? I see from the closed issues that there used to be one, but it seems to have been removed in the latest version. Surely the validation loss should still be monitored, to ensure the model learns proper features?

Thanks!
Michael

Calculation of the similarity

Hi! Thank you for your great work!
I'm a bit curious how you calculated the cosine similarity here.
The code just computes the similarity with similarity_matrix = torch.matmul(features, features.T).
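A likely explanation (an assumption: that the features are L2-normalized, e.g. with F.normalize, before this line): for unit-norm rows, the dot product is the cosine similarity, so the single matmul yields the full pairwise cosine matrix. A quick numeric check:

    import torch
    import torch.nn.functional as F

    x = torch.randn(6, 128)
    feats = F.normalize(x, dim=1)                   # unit-norm rows
    sim = torch.matmul(feats, feats.T)              # pairwise dot products
    cos = F.cosine_similarity(x.unsqueeze(1), x.unsqueeze(0), dim=2)
    print(torch.allclose(sim, cos, atol=1e-6))      # True: the same matrix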

Size of tensors in the _cosine_simililarity function

Hi, I'm trying to understand the code in loss/nt_xent.py.

We pass "representations" as both arguments:

    def forward(self, zis, zjs):
        representations = torch.cat([zjs, zis], dim=0)
        similarity_matrix = self.similarity_function(representations, representations)

But when they are received in the cosine similarity function, somehow the shapes are (N, 1, C) and (1, 2N, C). How can one be double the size if you passed the same argument?

    def _cosine_simililarity(self, x, y):
        # x shape: (N, 1, C)
        # y shape: (1, 2N, C)
        # v shape: (N, 2N)
        v = self._cosine_similarity(x.unsqueeze(1), y.unsqueeze(0))
        return v
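What is likely going on (an interpretation, since both arguments are the same (2N, C) tensor): the two unsqueeze calls reshape the two copies differently, to (2N, 1, C) and (1, 2N, C), and the underlying cosine similarity broadcasts them to a common (2N, 2N, C) shape before reducing over C; the N vs 2N in the comments is just inconsistent labeling. A standalone check:

    import torch

    reps = torch.randn(8, 128)                     # 2N = 8 stacked representations
    cos = torch.nn.CosineSimilarity(dim=-1)
    v = cos(reps.unsqueeze(1), reps.unsqueeze(0))  # (8,1,128) vs (1,8,128), broadcast
    print(v.shape)                                 # torch.Size([8, 8]): the 2N x 2N matrix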

Thanks for your help.

How do i train the SimCLR model with my local dataset?

Dear researcher,
Thank you for the open-source code you provided, it is of great help to me for understanding contrastive learning.
But I still have some confusion about training the SimCLR model with my local dataset. Could you give me some guidance or tips? I would appreciate a reply to this issue.

Similarity matrix shape does not match the shape of the mask

Hello,

I was testing the implementation when an error occurred: The shape of the mask [512, 512] at index 0 does not match the shape of the indexed tensor [2, 2] at index 0.
My batch size is 256.

The error occurs in this part of the code:
similarity_matrix = torch.matmul(features, features.T)
mask = torch.eye(labels.shape[0], dtype=torch.bool).to(device)
labels = labels[~mask].view(labels.shape[0], -1)
similarity_matrix = similarity_matrix[~mask].view(similarity_matrix.shape[0], -1)

I'm wondering whether this is something I'm doing wrong, and how I can match the shapes of the tensors.

Thanks in advance!

Review Training | Fine-Tune | Test details

Hi, I just want to check all the experiment details and make sure I didn't miss any part:

  1. Training phase: use SimCLR (two encoder branches) to train on ImageNet for 1000 epochs and obtain initial pretrained weights.
  2. Fine-tuning: load the pretrained weights into a ResNet-18 (50/101/...) with frozen parameters, attach a linear classifier, and train the classifier on the CIFAR10/STL10 training set for 100 epochs.
  3. Test phase: freeze all encoder and classifier parameters, and test on the CIFAR10/STL10 test set.

Is this how you got the top-1 accuracy in the README?

batch size affect

Hi, I'm experimenting with CIFAR-10 using the default hyper-params, and it seems to yield a better score with a smaller batch size (e.g. 72% with batch size 256 but 78% with batch size 128). Is anyone else seeing the same?

Reproduced Results

Hi @sthalles ! Thank you very much for the great effort!

Does the table in your README contain results you reproduced yourself? If so, have you considered using ResNet-50 as the backbone? The SimCLR paper mainly uses ResNet-50.

Thanks!

About learning rate schedule

Hi, Thalles.

SimCLR/run.py

Line 79 in 1848fc9

scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=len(train_loader), eta_min=0,

Shouldn't it be T_max=args.epochs instead of T_max=len(train_loader), since the learning rate schedule steps once every epoch?
Thanks.
