
Syn-Rep-Learn

This repo contains research studies of learning from synthetic data (mostly images), including StableRep, Scaling, and SynCLR.

Disclaimer

This is not an officially supported Google product.

License

Apache 2.0 license.

Contact

Yonglong Tian ([email protected])
Lijie Fan ([email protected])


syn-rep-learn's Issues

Have you evaluated CLIP and StableRep++ on imagenet zero-shot classification with laion50m subsets?

Dear Authors,

I appreciate your exceptional contributions to the field, particularly your work on syn-rep-learn.

From your "StableRep" paper, particularly the scaling behavior of linear probing shown in Figure 6, I found that the trend does not seem to hold for ImageNet zero-shot classification with the provided CLIP-based checkpoints:
CLIP shows better accuracy than StableRep++ at larger pretraining scales.

Below, I've included the code I employed to generate these results, adapted from your repositories ("StableRep" and "Scaling").
The command and its corresponding output are as follows:

stablerep.zip

# model checkpoints are automatically downloaded from dropbox

# CLIP ViT-B-16
python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion3m:CLIP_vitb16
[laion3m:CLIP_vitb16]  ImageNet zero-shot accuracy: 21.83

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion10m:CLIP_vitb16
[laion10m:CLIP_vitb16]  ImageNet zero-shot accuracy: 40.732

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion20m:CLIP_vitb16
[laion20m:CLIP_vitb16]  ImageNet zero-shot accuracy: 45.754

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion50m:CLIP_vitb16
[laion50m:CLIP_vitb16]  ImageNet zero-shot accuracy: 49.564

# StableRep-pp
python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion3m:StableRep-pp_vitb16
[laion3m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 31.71

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion10m:StableRep-pp_vitb16
[laion10m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 40.86

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion20m:StableRep-pp_vitb16
[laion20m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 43.614

python eval_imagenet_clip.py --data-path /home/appuser/datasets/imagenet --model laion50m:StableRep-pp_vitb16
[laion50m:StableRep-pp_vitb16]  ImageNet zero-shot accuracy: 44.886

Given these observations, I am curious whether there is an oversight on my part or whether this reflects the models' actual behavior.
Could you share any insight into this discrepancy?
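For reference, this is the zero-shot protocol I assume eval_imagenet_clip.py follows (the function and variable names below are illustrative, not the script's actual API): embed one text prompt per class, embed each image, and pick the class whose prompt embedding has the highest cosine similarity.

```python
import numpy as np

def zero_shot_predict(image_embs: np.ndarray, class_embs: np.ndarray) -> np.ndarray:
    """Assign each image to the class with the most similar text embedding.

    image_embs: (n_images, d) image features
    class_embs: (n_classes, d) text features, one prompt embedding per class
    """
    # L2-normalize so the dot product equals cosine similarity
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = class_embs / np.linalg.norm(class_embs, axis=1, keepdims=True)
    sims = img @ txt.T              # (n_images, n_classes) cosine similarities
    return sims.argmax(axis=1)      # predicted class index per image

# Toy check: images roughly aligned with a class axis are recovered
class_embs = np.eye(3)                          # 3 orthogonal "prompt" embeddings
image_embs = np.array([[0.9, 0.1, 0.0],
                       [0.0, 1.0, 0.2],
                       [0.1, 0.0, 0.8]])
print(zero_shot_predict(image_embs, class_embs))  # [0 1 2]
```

If both checkpoints are evaluated with this same prompt set and similarity rule, the accuracy gap should come from the representations themselves rather than the evaluation harness.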

Best regards,

Small Typo

Line 40 of environment.yml
should be - pillow==9.4.0
not - pillow=9.4.0
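For context, conda's own dependency pins use a single =, while entries under a pip: subsection use pip's == syntax, so the == form is required when pillow is installed via pip. The fragment below is illustrative, not the repo's actual file:

```yaml
dependencies:
  - python=3.9         # conda spec: a single '=' is valid here
  - pip
  - pip:
      - pillow==9.4.0  # pip spec: requires '=='
```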

Question on MultiPosConLoss and local_batch_size != self.last_local_batch_size

Hi,

Thanks for the contribution and updated code on Supervised Contrastive Learning.
My question is related to this part of the loss:
https://github.com/google-research/syn-rep-learn/blob/main/StableRep/models/losses.py#L79

local_batch_size = feats.size(0)
...
# Create the label matrix; since in our specific case the
# label matrix inside each batch is the same, we can create
# it once and reuse it. For other cases, users need to
# compute it for each batch.

if local_batch_size != self.last_local_batch_size:
    ...

My understanding is that, for a given batch in a distributed setting, the label tensor (after all_gather) is identical across all GPUs, so there is no need to compute it multiple times; once per batch is enough.

My question is then about the condition local_batch_size != self.last_local_batch_size: why is the check done on the batch size rather than on the tensor values? Isn't the batch size pretty much the same throughout training?
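To make the trade-off concrete, here is a minimal sketch (assumed names, not the repo's actual code) of caching a mask keyed on batch size. The check fires essentially once per epoch, when drop_last=False leaves a smaller final batch; caching on size alone is only safe when the label pattern is a fixed function of batch size, as the quoted comment says it is in this case.

```python
import numpy as np

class MultiPosMaskCache:
    """Illustrative sketch: cache the positive-pair mask between steps.

    Valid only when the label pattern is a fixed function of the batch
    size (as in StableRep, where every batch is laid out the same way);
    otherwise the mask must be rebuilt from the actual labels each step.
    """
    def __init__(self, group_size: int):
        self.group_size = group_size      # samples sharing one label
        self.last_batch_size = None
        self.mask = None

    def get_mask(self, batch_size: int) -> np.ndarray:
        # Rebuild only when the batch size changes, e.g. for the
        # smaller final batch of an epoch when drop_last=False.
        if batch_size != self.last_batch_size:
            labels = np.arange(batch_size) // self.group_size
            self.mask = (labels[:, None] == labels[None, :]).astype(np.float32)
            self.last_batch_size = batch_size
        return self.mask

cache = MultiPosMaskCache(group_size=2)
m = cache.get_mask(4)        # rebuilt: labels [0, 0, 1, 1]
m_again = cache.get_mask(4)  # cache hit: the stored mask is reused
print(m.astype(int))
```

Comparing tensor values every step would cost an extra all-reduce or elementwise comparison per iteration, which the size check avoids.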

Thank you !

Prompt generation by SynCLR

Hi, thanks for your great work. For SynCLR, you first generate prompts and then generate images based on those prompts. There are examples for some datasets, such as ImageNet and Food101, in the SYN_TEXT folder. Are there examples for other datasets, such as STL-10 and CIFAR-100?

Linear probing duration

I am currently running the linear probing evaluation for ImageNet distributed across 2 NVIDIA GTX 1080 GPUs. It takes 4 hours per epoch, with 90 epochs to run. What durations did you experience in your experiments?
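As general context (not specific to this repo's scripts), linear probing only trains a single linear layer on frozen backbone features, so most of the per-epoch cost is usually the backbone forward pass; precomputing the features once and training the probe on the cached features makes the remaining epochs cheap. A minimal numpy sketch of the probe itself, as multinomial logistic regression on frozen features:

```python
import numpy as np

def train_linear_probe(feats, labels, n_classes, lr=0.5, epochs=200, seed=0):
    """Multinomial logistic regression on frozen (precomputed) features."""
    rng = np.random.default_rng(seed)
    n, d = feats.shape
    W = rng.normal(scale=0.01, size=(d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # softmax cross-entropy gradient
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Toy frozen features: two well-separated clusters
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(-2, 1, (50, 8)), rng.normal(2, 1, (50, 8))])
labels = np.array([0] * 50 + [1] * 50)
W, b = train_linear_probe(feats, labels, n_classes=2)
acc = ((feats @ W + b).argmax(axis=1) == labels).mean()
print(f"probe accuracy: {acc:.2f}")
```

With cached features, the 90 probe epochs amount to repeated matrix multiplies over the feature matrix, which is far cheaper than 90 passes through the ViT backbone.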
