
pycls's Introduction

pycls

Support Ukraine

pycls is an image classification codebase, written in PyTorch. It was originally developed for the On Network Design Spaces for Visual Recognition project. pycls has since matured and been adopted by a number of projects at Facebook AI Research.

pycls provides a large set of baseline models across a wide range of flop regimes.

Introduction

The goal of pycls is to provide a simple and flexible codebase for image classification. It is designed to support rapid implementation and evaluation of research ideas. pycls also provides a large collection of baseline results (Model Zoo). The codebase supports efficient single-machine multi-gpu training, powered by the PyTorch distributed package, and provides implementations of standard models including ResNet, ResNeXt, EfficientNet, and RegNet.

Using pycls

Please see GETTING_STARTED for brief installation instructions and basic usage examples.
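
For a quick feel of the API, here is a minimal sketch of building and running a pretrained model via the model-zoo helpers; the helper name and the "400MF" identifier follow the usage patterns shown in the issues further down this page and may differ across versions, so see GETTING_STARTED for the authoritative examples.

    import torch
    import pycls.models

    # Build a pretrained RegNetY-400MF from the Model Zoo (downloads weights).
    model = pycls.models.regnety("400MF", pretrained=True)
    model.eval()

    # Run a dummy ImageNet-sized batch through it.
    with torch.no_grad():
        logits = model(torch.zeros(1, 3, 224, 224))
    print(logits.shape)  # expected: torch.Size([1, 1000])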

Model Zoo

We provide a large set of baseline results and pretrained models available for download in the pycls Model Zoo, including the simple, fast, and effective RegNet models that we hope can serve as solid baselines across a wide range of flop regimes.

Sweep Code

The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models, as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.

Projects

A number of projects at FAIR have been built on top of pycls:

If you are using pycls in your research and would like to include your project here, please let us know or send a PR.

Citing pycls

If you find pycls helpful in your research or refer to the baseline results in the Model Zoo, please consider citing an appropriate subset of the following papers:

@InProceedings{Radosavovic2019,
  title = {On Network Design Spaces for Visual Recognition},
  author = {Ilija Radosavovic and Justin Johnson and Saining Xie and Wan-Yen Lo and Piotr Doll{\'a}r},
  booktitle = {ICCV},
  year = {2019}
}

@InProceedings{Radosavovic2020,
  title = {Designing Network Design Spaces},
  author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
  booktitle = {CVPR},
  year = {2020}
}

@InProceedings{Dollar2021,
  title = {Fast and Accurate Model Scaling},
  author = {Piotr Doll{\'a}r and Mannat Singh and Ross Girshick},
  booktitle = {CVPR},
  year = {2021}
}

License

pycls is released under the MIT license. Please see the LICENSE file for more information.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

pycls's People

Contributors

alimbekovkz, amyreese, dmitryvinn, flystarhe, igorsugak, ir413, mannatsingh, pdollar, r-barnes, rahulg, rajprateek, sdebnathusc, shoufachen, stanislavglebik, thatch, theschnitz


pycls's Issues

size mismatch for stem.conv.weight: copying a param of torch.Size([64, 3, 7, 7]) from checkpoint, where the shape is torch.Size([16, 3, 3, 3]) in current model. size mismatch for stem.bn.weight: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([16]) in current model.

size mismatch for stem.conv.weight: copying a param of torch.Size([64, 3, 7, 7]) from checkpoint, where the shape is torch.Size([16, 3, 3, 3]) in current model.
size mismatch for stem.bn.weight: copying a param of torch.Size([64]) from checkpoint, where the shape is torch.Size([16]) in current model.

Waiting For the RegNet

Hi, thanks for the codebase.
I wonder when the RegNet pre-trained models will be released.

About Fig.5 in RegNet paper.

Hi, I'm confused about Fig. 5 (left, middle) in the RegNet paper.

I know Fig. 5 (left) shows results under the condition of a shared bottleneck ratio b_i = b, but which b did you choose to get the results? Or are the conclusions the same for every specific value of b? I have the same confusion about the group width g in the middle figure.

Did I miss something?

Cannot load pretrain model.

Thanks for the nice work!

My env:
Win10 + PyTorch 1.6

When I try to use
model = pycls.models.regnety(model_cate, pretrained=True)

I just get bad results, the same as when I set pretrained=False.

I also tried to load the weights myself with
model.load_state_dict(torch.load(load_path), strict=True)

and got the error below:

model.load_state_dict(torch.load(load_path), strict=True)
  File "F:\Software\Anaconda\envs\pth\lib\site-packages\torch\nn\modules\module.py", line 1045, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for RegNet:
Missing key(s) in state_dict: "stem.conv.weight", "stem.bn.weight", "stem.bn.bias", "stem.bn.running_mean", "stem.bn.running_var", "s1.b1.proj.weight", "s1.b1.bn.weight", ... (every parameter and buffer of the model is listed as missing) ..., "head.fc.weight", "head.fc.bias".
Unexpected key(s) in state_dict: "epoch", "model_state", "optimizer_state", "cfg".

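The unexpected keys show that the downloaded file is a training checkpoint (a dict with "epoch", "model_state", "optimizer_state", and "cfg") rather than a bare state dict. Below is a minimal sketch of loading it, assuming the weights live under "model_state" and the architecture is rebuilt with the matching config via the model-zoo helper; the helper name and the "32GF" identifier are assumptions to be checked against your pycls version.

    import torch
    import pycls.models

    load_path = "RegNetY-32GF_dds_8gpu.pyth"  # hypothetical local checkpoint path
    checkpoint = torch.load(load_path, map_location="cpu")
    # dict_keys(['epoch', 'model_state', 'optimizer_state', 'cfg'])
    print(checkpoint.keys())

    # Build the matching architecture, then load the weights stored under "model_state".
    model = pycls.models.regnety("32GF", pretrained=False)
    model.load_state_dict(checkpoint["model_state"], strict=True)

Shape mismatches like the ones quoted at the top of this issue list usually mean the architecture being built does not match the checkpoint's config.
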
Release additional pre-trained models (if available)

First, thank you for this contribution and the release of code and pre-trained models. I have trained two of the smaller settings and get similar error rates to the released models.

In the MODEL_ZOO.md readme, you mention that "the reported errors are averaged across 5 reruns for robust estimates". If these additional models (5 models per setting) are saved somewhere and match the current codebase, it would be a valuable addition to what is currently released. For instance, research in ensemble methods or analysis in variations across models would benefit greatly from this contribution.

For the larger settings (e.g., RegNetX-32GF at 76 train hours for 8 GPUs), training 5 models would take over two weeks on 8 GPUs, making it difficult for most researchers to do. Thanks!

How to use another dataset to train and test?

How can I use another dataset for training and testing?
What is the size of ImageNet?

size mismatch for head.fc.bias: copying a param of torch.Size([1000]) from checkpoint, where the shape is torch.Size([10]) in current model.

Fail to use torch.utils.tensorboard when training with multi-gpu

Hi all,
I was trying to log information with TensorBoard, so I saved the loss and accuracy at the end of both train_epoch and test_epoch. Everything was fine when training with a single GPU, but it failed with multiple GPUs: the browser just shows "No dashboards are active for the current data set...".
Has anyone else run into this?

P.S. My environment was pulled from the nvidia-docker image nvcr.io/nvidia/pytorch:19.10-py3
(docker image details)

Thanks!
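
A common cause (a guess, not a confirmed diagnosis for this setup): with multi-process data-parallel training, every process writes its own event files, which can leave TensorBoard with no usable run. One usual pattern is to create the SummaryWriter only in the master process; here is a minimal sketch, assuming a torch.distributed setup.

    import torch.distributed as dist
    from torch.utils.tensorboard import SummaryWriter

    def is_master():
        # rank 0 only; falls back to True for single-process runs
        return not dist.is_available() or not dist.is_initialized() or dist.get_rank() == 0

    writer = SummaryWriter(log_dir="runs/exp") if is_master() else None

    # inside train_epoch / test_epoch, guard every write:
    if writer is not None:
        writer.add_scalar("train/loss", 0.123, global_step=0)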

module 'pycls.core' has no attribute 'builders'

I installed pycls following GETTING_STARTED.md.
I run "python ./tools/train_net.py --cfg ./configs/dds_baselines/regnetx/RegNetX-400MF_dds_8gpu.yaml OUT_DIR ./tmp".
I get this error:
Traceback (most recent call last):
  File "./tools/train_net.py", line 12, in <module>
    import pycls.core.trainer as trainer
  File "/home/ex/pycls-master/pycls/core/trainer.py", line 14, in <module>
    import pycls.core.builders as builders
  File "/home/ex/pycls-master/pycls/core/builders.py", line 12, in <module>
    from pycls.models.anynet import AnyNet
  File "/home/ex/pycls-master/pycls/models/__init__.py", line 10, in <module>
    from pycls.models.model_zoo import effnet, regnetx, regnety, resnet, resnext
  File "/home/ex/pycls-master/pycls/models/model_zoo.py", line 12, in <module>
    import pycls.core.builders as builders
AttributeError: module 'pycls.core' has no attribute 'builders'

Thanks

How to sample models for Figure 11 in RegNet paper

Hi, I noticed that 100 models are sampled to get the results shown in Figure 11 (Sec. 4).

However, since the flops in the figure span a wide range (0.2B to 12.8B), I don't know whether

the total number of models across all flop regimes is 100, or

you sampled 100 models for each flop regime.

would you tell us how to prepare imagenet dataset?

Hi,
After going through the code, I noticed this line:

self._class_ids = sorted(

It seems that the ImageNet val set does not store its images in per-class subdirectories the way the train set does. Why is the dataset implemented like this? Would you please tell us how to prepare the ImageNet dataset so that we can reproduce the results in the model zoo?
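
For reference, here is a minimal sketch of the per-class layout that loaders built around sorted class directories typically expect; the paths, split names, and file extension are assumptions for illustration, not pycls documentation.

    from pathlib import Path

    # Assumed layout: one subdirectory per class (WordNet ID) under each split:
    #   imagenet/train/n01440764/*.JPEG ... imagenet/train/n15075141/*.JPEG
    #   imagenet/val/n01440764/*.JPEG   ... imagenet/val/n15075141/*.JPEG
    root = Path("/path/to/imagenet")
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.exists():
            continue
        class_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir())
        n_images = sum(1 for d in class_dirs for _ in d.glob("*.JPEG"))
        print(split, len(class_dirs), "classes,", n_images, "images")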

How to add agent?

If I change the model_builder to a trainable reinforcement learning agent and want to use the multi-GPU training code, what should I do?
Thanks!

Plan to support the design space comparison

Hi @rajprateek, @ir413,
Thanks for your team's great work; it provides many insights to the community. I am sure the model zoo and the current codebase will inspire a lot of future research.

I am also a little curious about the future plans for this codebase. Do you have any plans to support design space comparison in this repo? For example, allowing users to sample and train models from different design spaces and then compare those design spaces, as described in Sec. 3.1 and shown in Figs. 5, 7, and 9 of the paper. I think this feature would help the community reproduce the comparison process and further increase the codebase's impact.

time_net.py gives different results from those in the Model Zoo

Hi - I appreciate there's already an open issue related to speed, but mine is slightly different.

When I run
python tools/time_net.py --cfg configs/dds_baselines/regnetx/RegNetX-1.6GF_dds_8gpu.yaml
having changed GPUS: from 8 to 1, I get the following dump. I am running this on a batch of size 64, with input resolution 224x224, on a V100, as stated in the paper.

[screenshot: time_net.py output for RegNetX-1.6GF]
This implies a forward pass of ~62ms, not the 33ms stated in MODEL_ZOO. Have I done something wrong? Not sure why the times are so different. The other numbers (acts, params, flops) all seem fine. The latency differences are seen for other models as well - here is 800MF (39ms vs model zoo's 21ms):
[screenshot: time_net.py output for RegNetX-800MF]

I am using commit a492b56, not the latest version of the repo, but MODEL_ZOO has not been changed since before this commit. This is because with that commit it is possible to time the models on dummy data rather than having to construct a dataset. Would it be possible to have an option to do this? I can open a separate issue as a feature request for consideration if necessary.


Any plans to enable transfer learning?

`get_loss_fun` err message

assert cfg.MODEL.LOSS_FUN in _loss_funs.keys(), err_str.format(cfg.TRAIN.LOSS)

should be changed to:

assert cfg.MODEL.LOSS_FUN in _loss_funs.keys(), err_str.format(cfg.MODEL.LOSS_FUN)

Question about speed

Hello, I tested the inference speed of RegNetX-8.0GF and RegNetX-600MF on a P40, both with batch_size = 1 and input_size = 224x224, averaged over 50 runs.

The result is that the average inference time of RegNetX-8.0GF is 22ms and that of RegNetX-600MF is 15ms. Why are the FLOPs of the two so different while the inference times are not?

Also, the inference time of MobileNetV1 is only 3.3ms. Are there plans to release a RegNet model that is this fast?

Thanks a lot~

Bottlenecked by Dataloader

Hello Everyone

I am running some experiments using pycls and despite my best efforts, I was not able to run RegNetX-200MF_dds_8gpu.yaml without being bottlenecked by the data loader.

As a minimal example I did the following:

I ran this config on PyTorch 1.4.0, CUDA 10.1 in accordance with #79. (Full env below)
python tools/time_net.py --cfg configs/dds_baselines/regnetx/RegNetX-200MF_dds_8gpu.yaml
[screenshot: time_net.py output for RegNetX-200MF]

When I start training I get an ETA of roughly 3d20h, while you were able to train the same net in 2.8h on 8 GPUs, so I would expect a ballpark runtime of 20h.
python tools/train_net.py --cfg configs/dds_baselines/regnetx/RegNetX-200MF_dds_8gpu.yaml
[screenshot: train_net.py log output]

This minimal example was run on a Dell Precision 7730. But I have the same problem when executing remotely on a server with 8 GPUs.

I am a bit lost over here so any help would be greatly appreciated!
Best Lukas

environment.yml.txt
python -m cProfile -s cumtime tools/time_net.py --cfg configs/dds_baselines/regnetx/RegNetX-200MF_dds_8gpu.yaml
[screenshot: cProfile output]

implementation of empirical bootstrap

Hi, thank you very much for bringing us this codebase and the recently released model zoo. I'm really interested in your related work.

Would you please provide the empirical bootstrap implementation used in Designing Network Design Spaces?
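
For context while waiting for the official code, here is a minimal numpy sketch of one common form of the empirical bootstrap over a model population (estimating, with an empirical confidence band, the best error reachable when sampling n models); this illustrates the general technique only, not the authors' exact implementation.

    import numpy as np

    def bootstrap_min_error(errors, n=25, reps=10000, alpha=0.05, seed=0):
        """errors: top-1 errors of a population of trained models.
        Repeatedly draw n models with replacement, record the best error,
        and report the mean plus an empirical (alpha/2, 1 - alpha/2) band."""
        rng = np.random.default_rng(seed)
        errors = np.asarray(errors, dtype=float)
        samples = rng.choice(errors, size=(reps, n), replace=True)
        best = samples.min(axis=1)
        lo, hi = np.quantile(best, [alpha / 2, 1 - alpha / 2])
        return best.mean(), (lo, hi)

    # toy usage with synthetic "errors"
    toy_errors = np.random.default_rng(1).uniform(24, 40, size=500)
    print(bootstrap_min_error(toy_errors))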

With AutoAugment (RandAugment) or CutMix?

I have tried to train RegNet variants with strong augmentations, such as AutoAugment or CutMix.

But the performance cannot be improved with them.

For example, I have reproduced the paper's result for RegNetY-400MF, but with CutMix I get only around 69% top-1 accuracy, which is well below the vanilla RegNetY-400MF.

I also tried training the RegNetY for more epochs, but failed to improve the result.

Do you have any experience with strong data augmentations?

Maybe the SE width is wrong?

if se_r:
    w_se = int(round(w_in * se_r))
    self.se = SE(w_b, w_se)

In anynet.py line 192, should w_in be changed to w_b? The input width for the SE block is w_b, not w_in.
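
For reference, here is a minimal sketch of a squeeze-and-excitation block consistent with the snippet above and with the parameter names seen in the checkpoints (f_ex.0 / f_ex.2): the block operates on a tensor with w_b channels, while the reduced width w_se is computed elsewhere (in the quoted code, from w_in * se_r). This is an illustrative reimplementation, not the exact pycls code.

    import torch
    import torch.nn as nn

    class SE(nn.Module):
        """Squeeze-and-excitation: global pool, reduce to w_se, expand back, gate."""
        def __init__(self, w_in, w_se):
            super().__init__()
            self.avg_pool = nn.AdaptiveAvgPool2d(1)
            self.f_ex = nn.Sequential(
                nn.Conv2d(w_in, w_se, 1, bias=True),
                nn.ReLU(inplace=True),
                nn.Conv2d(w_se, w_in, 1, bias=True),
                nn.Sigmoid(),
            )

        def forward(self, x):
            return x * self.f_ex(self.avg_pool(x))

    # toy usage: 64 bottleneck channels, squeeze width 16
    out = SE(64, 16)(torch.randn(2, 64, 8, 8))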

Reproduce the result of RegNetY

Thanks for sharing the code.

I have tried to reproduce the result of RegNetY-400MF, but failed.

I changed num_gpus from 8 (in the original configuration) to 4 and ran the command as the readme suggests.

python tools/train_net.py --cfg configs/dds_baselines/regnety/RegNetY-400MF_dds_8gpu.yaml

I get only 68%-70% top-1 accuracy, which is well below the official result.

Here is my environment.

V100 x 4
pytorch 1.6
CUDA 10.1

What could be the reason?
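
One possible factor (a guess, not a confirmed diagnosis): the dds_8gpu configs specify the total batch size and base learning rate for 8 GPUs, so halving the GPU count without adjusting them changes the per-GPU load and the effective schedule. If the linear scaling rule is applied, both values are scaled by the same factor; here is a tiny sketch of the arithmetic, with placeholder reference values that should be read from the yaml.

    # Linear scaling rule sketch: going from 8 GPUs to 4, scale the total batch
    # size and the base LR by the same factor (reference values are hypothetical;
    # take them from the config file, e.g. TRAIN.BATCH_SIZE and OPTIM.BASE_LR).
    ref_gpus, new_gpus = 8, 4
    ref_batch, ref_lr = 1024, 0.8  # whatever the yaml specifies
    scale = new_gpus / ref_gpus
    print("TRAIN.BATCH_SIZE", int(ref_batch * scale), "OPTIM.BASE_LR", ref_lr * scale)

The scaled values can then be passed as command-line overrides, the same way OUT_DIR is overridden elsewhere on this page.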

Should the Residual Block drop the last conv1x1 if b=1?

Hello,
As the title describes, we can drop the last conv1x1 if no bottleneck (b=1) is used. I think I read this in the paper, but I could not find it in your implementation.

We also observe that the best models use a bottleneck ratio b of 1.0 (top-middle), which effectively removes the bottleneck (commonly used in practice).

Did you try to drop either the first or last conv1x1 in your experiments?
Thank you.

The data augmentation in dataloader

Hi, thanks for this repo. In https://github.com/facebookresearch/pycls/blob/master/MODEL_ZOO.md, you say that your primary goal is to provide simple and strong baselines that are easy to reproduce. However, I found that the repo still uses PCA random lighting for data augmentation. After removing PCA random lighting, the performance drops; I tested this on ResNet-50 (23.4677 vs 23.2) and EfficientNet-B0 (25.52 vs 24.9). I believe it would make reproduction easier if this repo also reported results with only basic transformations, considering that other data loaders (e.g., NVIDIA DALI) may find PCA lighting hard to implement :)
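
For context, here is a minimal sketch of the AlexNet-style PCA lighting augmentation being discussed; the eigenvalues/eigenvectors below are the commonly used ImageNet RGB statistics, and both they and the noise scale should be treated as assumptions rather than the repo's exact values.

    import numpy as np

    # Commonly used ImageNet RGB PCA statistics (AlexNet-style lighting noise).
    EIG_VALS = np.array([0.2175, 0.0188, 0.0045])
    EIG_VECS = np.array([
        [-0.5675,  0.7192,  0.4009],
        [-0.5808, -0.0045, -0.8140],
        [-0.5836, -0.6948,  0.4203],
    ])

    def pca_lighting(img, alpha_std=0.1, rng=np.random.default_rng()):
        """img: HxWx3 float array in [0, 1]. Adds one random RGB offset, drawn
        along the principal components of ImageNet pixel values, to all pixels."""
        alpha = rng.normal(0.0, alpha_std, size=3)
        offset = EIG_VECS @ (alpha * EIG_VALS)
        return np.clip(img + offset, 0.0, 1.0)

    # toy usage
    out = pca_lighting(np.random.default_rng(0).random((224, 224, 3)))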

Would you please list the accuracy of the models?

Hi,

Thanks for bringing this codebase to us. I noticed that, with the configuration in this codebase, EfficientNet-B0 is trained for 50 epochs, which is fewer than in the paper. Can this configuration reach the same accuracy as reported in the paper? Would you please list the accuracy of the provided configurations?

"top1_err": 0.0000, "top5_err": 0.0000

My errors become zero within a single epoch. Is this to be expected? I am training ResNeXt-101.

[trainer.py: 165]: Start epoch: 1
[meters.py: 153]: json_stats: {"_type": "train_iter", "epoch": "1/100", "eta": "21,12:49:36", "iter": "10/931", "loss": 0.9199, "lr": 0.0125, "mem": 10038, "time_avg": 19.9869, "time_diff": 7.1350, "top1_err": 3.1250, "top5_err": 1.5625}
[meters.py: 153]: json_stats: {"_type": "train_iter", "epoch": "1/100", "eta": "17,03:15:09", "iter": "20/931", "loss": 0.0001, "lr": 0.0125, "mem": 10038, "time_avg": 15.9058, "time_diff": 0.4157, "top1_err": 0.0000, "top5_err": 0.0000}
[meters.py: 153]: json_stats: {"_type": "train_iter", "epoch": "1/100", "eta": "17,18:20:00", "iter": "30/931", "loss": 0.0158, "lr": 0.0125, "mem": 10038, "time_avg": 16.4908, "time_diff": 5.4462, "top1_err": 0.0000, "top5_err": 0.0000}
[meters.py: 153]: json_stats: {"_type": "train_iter", "epoch": "1/100", "eta": "16,18:25:35", "iter": "40/931", "loss": 0.0339, "lr": 0.0125, "mem": 10038, "time_avg": 15.5678, "time_diff": 0.4202, "top1_err": 0.0000, "top5_err": 0.0000}

Question about stem_w

Does the stem width parameter stem_w stay as 32 for all RegNet models, or does it follow the initial width w_0?

Why did you compare RegNet with your EfficientNet results instead of original EfficientNet results from the paper?

1. Why did you compare RegNet with your own EfficientNet results instead of the original EfficientNet results from the paper (https://arxiv.org/abs/1905.11946, Table 2)?

2. Why didn't you use the enhancements (DropPath, more epochs, RMSProp, AutoAugment, ...) from Table 7 of the RegNet paper for training RegNet?

PyTorch 1.4 is OK; PyTorch 1.2 fails

PyTorch 1.2 hits the following error:

  File "tools/train_net.py", line 255, in <module>
    main()
  File "tools/train_net.py", line 251, in main
    single_proc_train()
  File "tools/train_net.py", line 229, in single_proc_train
    train_model()
  File "tools/train_net.py", line 161, in train_model
    model = model_builder.build_model()
  File "xxxx/pycls/pycls/core/model_builder.py", line 36, in build_model
    model = _models[cfg.MODEL.TYPE]()
  File "xxxx/pycls/pycls/models/resnet.py", line 234, in __init__
    self._construct_cifar()
  File "xxxx/pycls/pycls/models/resnet.py", line 248, in _construct_cifar
    self.s1 = ResStage(w_in=16, w_out=16, stride=1, d=d)
  File "xxxx/pycls/pycls/models/resnet.py", line 164, in __init__
    super(ResStage, self).__init__()
  File "xxxx/anaconda3/envs/pytorch12/lib/python3.7/site-packages/torch/nn/modules/module.py", line 72, in __init__
    self._construct()
TypeError: _construct() missing 6 required positional arguments: 'w_in', 'w_out', 'stride', 'd', 'w_b', and 'num_gs'

Figure 15. vs Eqn. (2-4)

Hi, I'm confused by the numbers in Figure 15.

Take RegNetX-3.2GF for example. The parameters are as follows:

d = [2, 6, 15, 2]
w = [96, 192, 432, 1008]
wa = 26
w0 = 88
wm = 2.2

I can't get w = 96 for the first stage from either Eqn. 2, u_j = w_0 + w_a * j, or Eqn. 4, w_j = w_0 * w_m^(s_j).

Did I miss something?
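
A possible explanation (hedged): the per-stage widths are not the raw outputs of Eqn. 2-4; they are first quantized (snapped to a power of w_m relative to w_0, then rounded to a multiple of 8) and afterwards adjusted to be divisible by the group width, which is what bumps 88 up to 96 for the first stage. Here is a simplified sketch of that procedure, assuming the unrounded RegNetX-3.2GF values are roughly w_a = 26.31, w_0 = 88, w_m = 2.25, d = 25, g = 48 (Figure 15 reports rounded parameters); this is an illustration, not the exact pycls code.

    import numpy as np

    def regnet_stage_widths(w_a, w_0, w_m, d, g, q=8):
        u = np.arange(d) * w_a + w_0                   # u_j = w_0 + w_a * j   (Eqn. 2)
        s = np.round(np.log(u / w_0) / np.log(w_m))    # s_j                   (Eqn. 3)
        w = w_0 * np.power(w_m, s)                     # w_j = w_0 * w_m^s_j   (Eqn. 4)
        w = (np.round(w / q) * q).astype(int)          # round to a multiple of q
        stage_ws = sorted(set(w.tolist()))             # unique per-stage widths
        # adjust each stage width to be divisible by the group width g
        return [int(round(ws / g) * g) for ws in stage_ws]

    print(regnet_stage_widths(26.31, 88, 2.25, 25, 48))  # -> [96, 192, 432, 1008]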

How to do transfer learning with these models?

I have tried to load the models but failed:

from pycls.models.regnet import RegNet

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = RegNet()
# optimizer = TheOptimizerClass(*args, **kwargs)

checkpoint = torch.load('model/RegNetY-32GF_dds_8gpu.pyth', map_location=device)['model_state']
model.load_state_dict(checkpoint)

but got this error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-19-2467f7a101a2> in <module>
      9         checkpoint[key.replace('model.', '')] = checkpoint[key]
     10         del checkpoint[key]
---> 11 model.load_state_dict(checkpoint)

c:\users\neo\.conda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py in load_state_dict(self, state_dict, strict)
    828         if len(error_msgs) > 0:
    829             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 830                                self.__class__.__name__, "\n\t".join(error_msgs)))
    831         return _IncompatibleKeys(missing_keys, unexpected_keys)
    832 

RuntimeError: Error(s) in loading state_dict for RegNet:
	Missing key(s) in state_dict: "s1.b3.f.a.weight", "s1.b3.f.a_bn.weight", "s1.b3.f.a_bn.bias", "s1.b3.f.a_bn.running_mean", "s1.b3.f.a_bn.running_var", "s1.b3.f.b.weight", "s1.b3.f.b_bn.weight", "s1.b3.f.b_bn.bias", "s1.b3.f.b_bn.running_mean", "s1.b3.f.b_bn.running_var", "s1.b3.f.c.weight", "s1.b3.f.c_bn.weight", "s1.b3.f.c_bn.bias", "s1.b3.f.c_bn.running_mean", "s1.b3.f.c_bn.running_var", "s1.b4.f.a.weight", "s1.b4.f.a_bn.weight", "s1.b4.f.a_bn.bias", "s1.b4.f.a_bn.running_mean", "s1.b4.f.a_bn.running_var", "s1.b4.f.b.weight", "s1.b4.f.b_bn.weight", "s1.b4.f.b_bn.bias", "s1.b4.f.b_bn.running_mean", "s1.b4.f.b_bn.running_var", "s1.b4.f.c.weight", "s1.b4.f.c_bn.weight", "s1.b4.f.c_bn.bias", "s1.b4.f.c_bn.running_mean", "s1.b4.f.c_bn.running_var", "s2.b6.f.a.weight", "s2.b6.f.a_bn.weight", "s2.b6.f.a_bn.bias", "s2.b6.f.a_bn.running_mean", "s2.b6.f.a_bn.running_var", "s2.b6.f.b.weight", "s2.b6.f.b_bn.weight", "s2.b6.f.b_bn.bias", "s2.b6.f.b_bn.running_mean", "s2.b6.f.b_bn.running_var", "s2.b6.f.c.weight", "s2.b6.f.c_bn.weight", "s2.b6.f.c_bn.bias", "s2.b6.f.c_bn.running_mean", "s2.b6.f.c_bn.running_var". 
	**Unexpected key(s) in state_dict:** "s3.b1.proj.weight", "s3.b1.bn.weight", "s3.b1.bn.bias", "s3.b1.bn.running_mean", "s3.b1.bn.running_var", "s3.b1.bn.num_batches_tracked", "s3.b1.f.a.weight", "s3.b1.f.a_bn.weight", "s3.b1.f.a_bn.bias", "s3.b1.f.a_bn.running_mean", "s3.b1.f.a_bn.running_var", "s3.b1.f
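
A minimal sketch of one way transfer learning is often set up with these models; it relies on the model-zoo helper and on the head.fc naming visible in the error messages above, both of which should be verified against your pycls version.

    import torch.nn as nn
    import pycls.models

    # Build a pretrained model via the model-zoo helper (rather than RegNet()),
    # so the architecture matches the downloaded weights.
    model = pycls.models.regnety("32GF", pretrained=True)

    # Replace the classification head for a new task with, say, 10 classes.
    num_classes = 10
    model.head.fc = nn.Linear(model.head.fc.in_features, num_classes)

    # Optionally freeze the backbone and fine-tune only the new head.
    for name, p in model.named_parameters():
        p.requires_grad = name.startswith("head.fc")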
