shunlu91 / single-path-one-shot-nas

SPOS (Single Path One-Shot Neural Architecture Search with Uniform Sampling) rebuilt in PyTorch on a single GPU.

single-path-one-shot-nas's Introduction

Single-Path-One-Shot-NAS

This repo provides a PyTorch-based implementation of SPOS (Single Path One-Shot Neural Architecture Search with Uniform Sampling) by Zichao Guo et al.

It only contains 'Block Search' for reference. Training this supernet on ImageNet is very time-consuming, which made it impossible for me to finish that experiment with limited resources. I therefore focused mainly on CIFAR-10.

Many thanks to Zichao Guo for his advice on some details. Nevertheless, some differences from the official version may still exist, such as data preprocessing and other hyperparameters.

Environments

Python==3.7.10, PyTorch==1.7.1, CUDA==10.2, cuDNN==7.6.5

Dataset

CIFAR-10 is downloaded automatically by this code. ImageNet must be downloaded manually, following the linked instructions.

Usage

Train a supernet on the CIFAR-10 dataset by simply running:

bash scripts/train_supernet.sh

My pretrained supernet can be downloaded from this link.

For convenience, I run a random search that samples 1,000 paths and selects the best one:

bash scripts/random_search.sh

In my search, the best path was [1, 0, 3, 1, 3, 0, 3, 0, 0, 3, 3, 0, 1, 0, 1, 2, 2, 1, 1, 3].
The original SPOS paper uses an evolutionary algorithm to search architectures. Please refer to the official repo for more details.
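
The random search step can be sketched as follows. This is a minimal illustration, not the code behind scripts/random_search.sh: `evaluate_path` is a hypothetical stand-in for validating a sub-network with weights inherited from the supernet, and the search space of 20 layers with 4 choices each is inferred from the best path listed above.

```python
import random

NUM_LAYERS = 20   # length of the path listed above
NUM_CHOICES = 4   # choice indices in that path range over 0..3


def sample_path(rng):
    """Uniformly sample one candidate block index per layer."""
    return [rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]


def evaluate_path(path):
    """Hypothetical placeholder score.

    In a real run this would be the validation accuracy of the sub-network
    selected by `path`, using weights inherited from the supernet.
    """
    return sum(path) / (len(path) * (NUM_CHOICES - 1))


def random_search(num_paths=1000, seed=0):
    """Sample `num_paths` paths and return the best (path, score) pair."""
    rng = random.Random(seed)
    best_path, best_score = None, float("-inf")
    for _ in range(num_paths):
        path = sample_path(rng)
        score = evaluate_path(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path, best_score


if __name__ == "__main__":
    path, score = random_search()
    print(path, round(score, 4))
```

Passing an explicit seed to a local `random.Random` keeps the sampled paths reproducible without touching global state.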

Set the "choice" variable defined on line 116 of retrain_best_choice.py to the best searched path, then retrain the corresponding architecture:

bash scripts/retrain_best_choice.sh

After retraining, the best test accuracy of this searched architecture is 93.31%. The checkpoint is provided here.

Since I fixed all seeds in the procedures above, the same results should be reproducible. You can check my logs in the logdir.
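
A minimal sketch of the kind of seed fixing this relies on, assuming the randomness comes from Python's `random` module and NumPy. `fix_seeds` and `sample_choice` are hypothetical helpers, not functions from this repo.

```python
import random

import numpy as np


def fix_seeds(seed):
    """Seed the Python and NumPy global random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    # A PyTorch training script would also call torch.manual_seed(seed) here.


def sample_choice():
    """Sample a 20-layer path with 4 candidates per layer."""
    return np.random.randint(4, size=20).tolist()


if __name__ == "__main__":
    fix_seeds(0)
    first = sample_choice()
    fix_seeds(0)
    second = sample_choice()
    print(first == second)  # the same seed reproduces the same path
```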

Reference

[1] Single Path One-Shot Neural Architecture Search with Uniform Sampling

[2] Differentiable architecture search for convolutional and recurrent networks

single-path-one-shot-nas's Issues

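What is the highest final accuracy that can be reached?

The README only reports the accuracy of 1,000 sub-networks randomly sampled from the supernet. After architecture search, can the best sub-architecture, whether fine-tuned or trained from scratch, exceed the accuracy of the original ShuffleNet?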

Model test

I have a question about the code: how do I test the model after it has been trained, given that the architecture of the sampled model is not known?

net output dimension mismatch on cifar10

Hey there,

When I run python train_CIFAR-10.py with Batch_size=128, the network output should have shape (128, 10), but I get (512, 10):

ValueError: Expected input batch_size (512) to match target batch_size (128).
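
One possible cause, offered as an assumption rather than a diagnosis of this repo: if the classifier head flattens with a reshape to `(-1, channels)` while the final feature map is still 2x2 (as can happen when a pooling layer sized for ImageNet meets 32x32 CIFAR-10 inputs), the spatial positions are folded into the batch dimension, and 128 x 2 x 2 = 512. The NumPy sketch below reproduces that arithmetic; the channel count is arbitrary.

```python
import numpy as np

batch_size, channels = 128, 64  # channels is an arbitrary illustrative value

# Assumed final feature map: spatial size 2x2 instead of the expected 1x1.
features = np.zeros((batch_size, channels, 2, 2), dtype=np.float32)

# Flattening with a fixed channel count folds the 2x2 positions into the batch.
wrong = features.reshape(-1, channels)

# Global average pooling over the spatial axes keeps one vector per image.
pooled = features.mean(axis=(2, 3))

print(wrong.shape)   # (512, 64)
print(pooled.shape)  # (128, 64)
```

If this is the cause, pooling the feature map down to 1x1 before flattening restores one output row per input image.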

How to do channel search?

Why does the accuracy stay at 10% when training this model?

I replaced the dataset with CIFAR-10 and reduced the number of layers. When I train this model, the accuracy stays at 10% and never changes.

Why is the shuffle_channels function not used?

I found that the shuffle_channels function is never used. Is this function unnecessary?

about extracting final result

Thanks for the update.
I ran the train_cifar10 code, and after train_search there is nothing but the model_save function.
I guess the final result after train_search should look like [3 3 1 3 0 1 0 2 2 2 0 3 1 1 0 1 2 1 1 1], right?

We could then load that configuration and get the searched architecture like this:
config = [3, 3, 1, 3, 0, 1, 0, 2, 2, 2, 0, 3, 1, 1, 0, 1, 2, 1, 1, 1]
outputs = model(inputs, config)
Am I right?

Then we can train it on CIFAR-10 to see how good it is. Maybe I can do this.

One more question. I found
random = np.random.randint(4, size=20)
outputs = model(inputs, random)
in both the train and validate functions. Should these two random configurations be the same or not?
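
For context, here is a sketch of single-path uniform sampling with a toy supernet. `ToySupernet` and its scalar candidate operations are hypothetical; only the `np.random.randint(4, size=20)` sampling and the `model(inputs, config)` call shape come from the code quoted above. Each call to `np.random.randint` draws a fresh path, so the paths drawn in the train and validate functions are independent unless a fixed `config` is passed explicitly.

```python
import numpy as np

NUM_LAYERS, NUM_CHOICES = 20, 4


class ToySupernet:
    """Toy stand-in: every layer holds 4 candidate ops, one is used per pass."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        # One scalar weight per (layer, choice); a real candidate would be a conv block.
        self.weights = rng.normal(1.0, 0.1, size=(NUM_LAYERS, NUM_CHOICES))

    def __call__(self, inputs, config):
        out = inputs
        for layer, choice in enumerate(config):
            # Only the selected candidate in each layer takes part in the forward pass.
            out = out * self.weights[layer, choice]
        return out


if __name__ == "__main__":
    model = ToySupernet()
    inputs = np.ones((2, 3))

    # Supernet training: draw a new uniform random path for each batch.
    random_config = np.random.randint(NUM_CHOICES, size=NUM_LAYERS)
    train_out = model(inputs, random_config)

    # Evaluating a searched architecture: pass its fixed path instead.
    config = [3, 3, 1, 3, 0, 1, 0, 2, 2, 2, 0, 3, 1, 1, 0, 1, 2, 1, 1, 1]
    fixed_out = model(inputs, config)
    print(train_out.shape, fixed_out.shape)
```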

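During block selection, are weights shared between different branches?

Hello, and thank you very much for your contribution. Reading your code, I see that weights are not shared between different branches, but the paper describes a weight-sharing scheme in which one large weight is created in advance and then sliced for the different branches. Does this weight sharing apply only to channel selection, or is it simply not implemented here?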
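
A minimal sketch of the scheme described in the question: one weight is allocated at the maximum width, and each channel choice uses a slice of it. `SharedLinear` is a hypothetical name, and this is not code from this repo, which implements block search only.

```python
import numpy as np


class SharedLinear:
    """Linear layer whose narrower variants reuse a slice of one large weight."""

    def __init__(self, max_in, max_out, seed=0):
        rng = np.random.default_rng(seed)
        # Allocate the weight once at the maximum width.
        self.weight = rng.normal(size=(max_out, max_in))

    def forward(self, x, out_channels):
        in_channels = x.shape[1]
        # Slice the shared weight to the sampled width; slicing returns a view.
        w = self.weight[:out_channels, :in_channels]
        return x @ w.T


if __name__ == "__main__":
    layer = SharedLinear(max_in=16, max_out=32)
    x = np.ones((4, 16))
    print(layer.forward(x, out_channels=8).shape)   # (4, 8)
    print(layer.forward(x, out_channels=32).shape)  # (4, 32)
```

Because every width reads from the same array, the first 8 output channels of the wide variant equal the output of the 8-channel variant.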

About Mixed-Precision Quantization

Is there an implementation of the Mixed-Precision Quantization application? Thanks.

Randomly sample and save models

Hi,
Thank you for this work. I am hoping to use it to carry out some experiments.

I was hoping to use the pre-trained CIFAR-10 supernet to sample some 1k architectures (randomly) and save their checkpoints/weights. My objective is to save these random models, modify the weights and then do the validation.

Is it possible for you to elaborate on how I can save the randomly sampled weights?
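
One way to approach this, sketched under assumptions: if each candidate block's parameters are stored under a key that encodes its layer and choice index, the weights of a sampled path can be picked out of the supernet's state dict by key. The `features.<layer>.<choice>.` key layout and `extract_subnet_state` are hypothetical; the real keys depend on how the supernet module is defined. Plain Python values stand in for tensors here.

```python
import random


def extract_subnet_state(supernet_state, path, prefix="features"):
    """Keep shared parameters plus the chosen block of every layer."""
    subnet_state = {}
    for key, value in supernet_state.items():
        parts = key.split(".")
        if parts[0] != prefix:
            # Parameters outside the searchable blocks (stem, classifier) are shared.
            subnet_state[key] = value
            continue
        layer, choice = int(parts[1]), int(parts[2])
        if path[layer] == choice:
            subnet_state[key] = value
    return subnet_state


if __name__ == "__main__":
    # Toy "state dict": 20 layers x 4 choices, plus one shared stem parameter.
    state = {"stem.weight": 0.5}
    for layer in range(20):
        for choice in range(4):
            state[f"features.{layer}.{choice}.weight"] = (layer, choice)

    rng = random.Random(0)
    path = [rng.randrange(4) for _ in range(20)]
    subnet = extract_subnet_state(state, path)
    print(len(subnet))  # 21: one block per layer plus the stem
```

With PyTorch, the resulting dict could be written to disk with `torch.save`, once per sampled path.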

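BN calibration

Why is BN calibration not performed when evaluating sub-networks during the search? The code at https://github.com/megvii-model/SinglePathOneShot/blob/master/src/Search/tester.py#L52 calibrates BN. Would skipping BN calibration introduce a large bias?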
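
For reference, BN calibration here means recomputing the batch-norm running statistics for one sampled sub-network on a few training batches before evaluating it, because the statistics kept from supernet training were accumulated over inputs from many different paths. A minimal NumPy sketch follows; `ToyBatchNorm` and `calibrate` are hypothetical and not taken from either repository.

```python
import numpy as np


class ToyBatchNorm:
    """Per-feature normalization with running statistics, as used at eval time."""

    def __init__(self, num_features, eps=1e-5):
        self.eps = eps
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.num_batches = 0

    def reset(self):
        self.running_mean[:] = 0.0
        self.running_var[:] = 1.0
        self.num_batches = 0

    def update(self, x):
        """Fold one batch into a cumulative average of the batch statistics."""
        self.num_batches += 1
        alpha = 1.0 / self.num_batches
        self.running_mean += alpha * (x.mean(axis=0) - self.running_mean)
        self.running_var += alpha * (x.var(axis=0, ddof=1) - self.running_var)

    def __call__(self, x):
        return (x - self.running_mean) / np.sqrt(self.running_var + self.eps)


def calibrate(bn, batches):
    """Reset the statistics, then re-estimate them from the given batches."""
    bn.reset()
    for x in batches:
        bn.update(x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Features as they might look under one fixed path (mean 3, std 2).
    batches = [rng.normal(3.0, 2.0, size=(256, 8)) for _ in range(20)]
    bn = ToyBatchNorm(8)
    calibrate(bn, batches)
    out = bn(batches[0])
    print(out.mean().round(2), out.std().round(2))
```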

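Hello, a question about reproducing the accuracy

Sorry to bother you:

After running 600 epochs, I did not reach the accuracy distribution you plotted when evaluating on the val set. Is the distribution plot you provided measured on the train set?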