
single-path-one-shot-nas's Introduction

Single-Path-One-Shot-NAS


This repo provides a PyTorch-based implementation of SPOS (Single Path One-Shot Neural Architecture Search with Uniform Sampling) by Zichao Guo et al.

It only contains 'Block Search' for reference. Training this supernet on ImageNet is very time-consuming, which made it impossible for me to finish that experiment with limited resources. Therefore, I mainly focused on CIFAR-10.

Many thanks to Zichao Guo for his advice on some details. Nevertheless, some differences from the official version may still exist, such as the data preprocessing and some hyperparameters.

Environments

Python==3.7.10, PyTorch==1.7.1, CUDA==10.2, cuDNN==7.6.5

Dataset

CIFAR-10 can be automatically downloaded using this code. ImageNet needs to be manually downloaded and here are some instructions.

Usage

  1. Train a supernet on the CIFAR-10 dataset by simply running:
bash scripts/train_supernet.sh
  • My pretrained supernet can be downloaded from this link.
  2. For convenience, I conduct a random search by evaluating 1,000 randomly sampled paths and selecting the best:
bash scripts/random_search.sh
  • During my search, the best path was [1, 0, 3, 1, 3, 0, 3, 0, 0, 3, 3, 0, 1, 0, 1, 2, 2, 1, 1, 3].
  • In the original SPOS paper, the authors adopted an evolutionary algorithm to search architectures. Please refer to their official repo for more details.
  3. Use the best searched path to modify the "choice" defined in Line 116 of retrain_best_choice.py, then retrain the architecture corresponding to this path:
bash scripts/retrain_best_choice.sh
  • After retraining, the best test accuracy of this searched architecture is 93.31%. The checkpoint is provided here.
  4. Because all seeds are fixed in the above procedures, the same results should be reproducible. You can check my logs in the logdir.
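
The random search step above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code behind scripts/random_search.sh: the names `sample_path`, `random_search`, and `toy_evaluate` are hypothetical, and the toy score only stands in for evaluating a path with the trained supernet on validation data. The path length (20) and the number of candidate blocks per layer (4) are taken from the best path reported above.

```python
import random

NUM_LAYERS = 20   # length of the path reported above
NUM_CHOICES = 4   # candidate blocks per layer (indices 0..3)


def sample_path(rng):
    """Draw one path uniformly at random: one block index per layer."""
    return [rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]


def random_search(evaluate, num_paths=1000, seed=0):
    """Return (best_path, best_score) among `num_paths` sampled paths."""
    rng = random.Random(seed)  # fixed seed so the search is repeatable
    best_path, best_score = None, float("-inf")
    for _ in range(num_paths):
        path = sample_path(rng)
        score = evaluate(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path, best_score


def toy_evaluate(path):
    """Placeholder score (hypothetical); stands in for validation accuracy."""
    return sum(path) / (len(path) * (NUM_CHOICES - 1))


best_path, best_score = random_search(toy_evaluate, num_paths=1000, seed=0)
print(best_path, round(best_score, 4))
```

In a real run, `toy_evaluate` would be replaced by a function that runs the supernet with the given path on held-out data and returns its accuracy.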

Reference

[1] Single Path One-Shot Neural Architecture Search with Uniform Sampling

[2] Differentiable architecture search for convolutional and recurrent networks

single-path-one-shot-nas's People

Contributors

shunlu91


single-path-one-shot-nas's Issues

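What is the highest final accuracy that can be achieved?

The README only reports the accuracy of 1,000 sub-networks randomly sampled from the supernet. After the architecture search, can the best sub-architecture, whether fine-tuned or trained from scratch, exceed the accuracy of the original ShuffleNet?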

model test

I have a question about the code: how do I test the model after it has been trained, given that the architecture of the sampled model is not known?

net output dimension mismatch on cifar10

Hey there,

When I run python train_CIFAR-10.py with Batch_size=128, the output of the network should be (128, 10), but I got (512, 10):

ValueError: Expected input batch_size (512) to match target batch_size (128).
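
A factor-of-four mismatch like this one (512 = 128 × 4) can arise when a feature map that is still 2×2 spatially is flattened with a hard-coded feature size, so the extra spatial positions spill into the batch dimension. The sketch below only works through that arithmetic. It is an assumption about one possible cause, not a diagnosis of this repository's code, and the helper `flattened_rows` and the channel count 1024 are hypothetical.

```python
def flattened_rows(batch, channels, height, width, features_per_row):
    """Rows obtained by reshaping (batch, channels, height, width) to (-1, features_per_row)."""
    total = batch * channels * height * width
    if total % features_per_row != 0:
        raise ValueError("feature size does not divide the tensor size")
    return total // features_per_row


# Hypothetical numbers: a batch of 128 with 1024 channels before the classifier.
# With a 1x1 final feature map, flattening to 1024 features keeps 128 rows.
rows_1x1 = flattened_rows(128, 1024, 1, 1, 1024)
# With a 2x2 final feature map, the same reshape yields four times as many rows.
rows_2x2 = flattened_rows(128, 1024, 2, 2, 1024)
print(rows_1x1, rows_2x2)
```

If this is the cause, pooling the feature map down to 1×1 before flattening, or flattening with the batch size fixed as the first dimension, keeps the output at 128 rows.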

about extracting final result

Thanks for the update!
I ran the train_cifar10 code, and after train_search there is nothing but the model_save function.
I guess the final result after train_search should look like [3 3 1 3 0 1 0 2 2 2 0 3 1 1 0 1 2 1 1 1], right?

We could then load that configuration and get the searched architecture like this:
config = [3, 3, 1, 3, 0, 1, 0, 2, 2, 2, 0, 3, 1, 1, 0, 1, 2, 1, 1, 1]
outputs = model(inputs, config)
Am I right?

Then we can train it on CIFAR-10 to see how good it is. Maybe I can do this.

One more question:
I found
random = np.random.randint(4, size=20)
outputs = model(inputs, random)
in both the train and validate functions. Should these two random configurations be the same or not?
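
The difference between passing a fixed configuration and a freshly sampled one can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `ToySupernet` is hypothetical and uses simple additions in place of the real convolutional blocks. It only shows that `choice` selects exactly one candidate per layer, so a fixed list always runs the same sub-network while a random list generally runs a different one on each call.

```python
import random

NUM_LAYERS, NUM_CHOICES = 20, 4


class ToySupernet:
    """Each layer holds 4 candidate ops; `forward` runs exactly one per layer."""

    def __init__(self):
        # Candidate k in every layer adds (k + 1); real blocks would be conv modules.
        self.layers = [
            [lambda x, k=k: x + (k + 1) for k in range(NUM_CHOICES)]
            for _ in range(NUM_LAYERS)
        ]

    def forward(self, x, choice):
        assert len(choice) == NUM_LAYERS
        for candidates, idx in zip(self.layers, choice):
            x = candidates[idx](x)  # only the chosen block is executed
        return x


model = ToySupernet()

# Fixed configuration: the same sub-network on every call.
config = [3, 3, 1, 3, 0, 1, 0, 2, 2, 2, 0, 3, 1, 1, 0, 1, 2, 1, 1, 1]
fixed_out = model.forward(0, config)

# Random configuration: a new single path is drawn for this call.
random_choice = [random.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)]
random_out = model.forward(0, random_choice)
print(fixed_out, random_out)
```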

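During block selection, are weights shared between different branches?

Hello, and thank you very much for your contribution. Reading your code, I see that weights are not shared between different branches. The paper, however, describes a weight-sharing scheme in which one large weight is created in advance and then sliced for the different branches. Does this weight sharing apply only to channel selection, or is it simply not implemented here?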
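
The slicing scheme described in this question can be sketched as follows. This is a toy illustration under stated assumptions, not code from this repository or from the official implementation: the weight is a nested Python list, `sliced_weight` and `linear` are hypothetical helpers, and a matrix-vector product stands in for a convolution.

```python
MAX_OUT, MAX_IN = 8, 4

# One shared weight matrix, allocated once at the largest size.
shared_weight = [
    [0.1 * (o * MAX_IN + i) for i in range(MAX_IN)] for o in range(MAX_OUT)
]


def sliced_weight(out_channels, in_channels):
    """Return the top-left (out_channels x in_channels) slice of the shared matrix."""
    return [row[:in_channels] for row in shared_weight[:out_channels]]


def linear(x, weight):
    """Plain matrix-vector product, standing in for a convolution."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


x = [1.0, 2.0, 3.0, 4.0]
small = linear(x, sliced_weight(4, 4))  # candidate with 4 output channels
large = linear(x, sliced_weight(8, 4))  # candidate with 8 output channels

# Both candidates reuse the same first 4 rows, so their first 4 outputs agree.
print(small == large[:4])
```

In PyTorch, basic slicing of a parameter tensor returns a view, so gradients from every candidate width would flow back into the same shared parameter. The list slices here are copies and only illustrate the indexing.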

Randomly sample and save models

Hi,
Thank you for this work. I am hoping to use it to carry out some experiments.

I was hoping to use the pre-trained CIFAR-10 supernet to sample some 1k architectures (randomly) and save their checkpoints/weights. My objective is to save these random models, modify the weights and then do the validation.

Is it possible for you to elaborate on how I can save the randomly sampled weights?
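
One way to approach this is to sample distinct paths and, for each one, keep only the supernet entries that belong to the chosen blocks. The sketch below is a minimal, self-contained illustration under stated assumptions: the key pattern `blocks.<layer>.<choice>.<param>` and the helpers `sample_unique_paths` and `subnet_weights` are hypothetical, and plain floats stand in for tensors. The real key names would have to be read from the supernet's own state dict.

```python
import json
import random

NUM_LAYERS, NUM_CHOICES = 20, 4


def sample_unique_paths(num_paths, seed=0):
    """Sample `num_paths` distinct paths, one block index per layer."""
    rng = random.Random(seed)
    seen = set()
    while len(seen) < num_paths:
        seen.add(tuple(rng.randrange(NUM_CHOICES) for _ in range(NUM_LAYERS)))
    return [list(p) for p in sorted(seen)]


def subnet_weights(supernet_state, path):
    """Keep shared entries plus, for each layer, only the chosen block's entries.

    Assumes hypothetical key names of the form 'blocks.<layer>.<choice>.<param>'.
    """
    kept = {}
    for key, value in supernet_state.items():
        parts = key.split(".")
        if parts[0] != "blocks":
            kept[key] = value  # e.g. stem / classifier weights used by every path
        elif path[int(parts[1])] == int(parts[2]):
            kept[key] = value
    return kept


# Stand-in for a supernet state dict (values would be tensors in practice).
supernet_state = {"stem.weight": 0.5, "classifier.weight": 0.25}
for layer in range(NUM_LAYERS):
    for choice in range(NUM_CHOICES):
        supernet_state[f"blocks.{layer}.{choice}.weight"] = layer + 0.1 * choice

paths = sample_unique_paths(1000)
records = [{"path": p, "weights": subnet_weights(supernet_state, p)} for p in paths]
serialized = json.dumps(records[0])  # each record could be written to its own file
print(len(records), len(records[0]["weights"]))
```

With real tensors, `torch.save` would take the place of `json.dumps`, and storing the path alongside the weights makes it possible to rebuild the matching sub-network later.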

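Hello, a question about reproducing the accuracy

Sorry to bother you:

After running for 600 epochs, I did not reach the accuracy distribution you plotted when evaluating on the val set. Is the distribution plot you provided measured on the train set?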

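After running supernet.py, how do I load the model for validation (val)?

First of all, thank you for the reimplementation.
After running supernet.py, I do not know which choice is the best. Where can I find the final choice of the supernet?
Also, how do I load the model for validation? I tried running random_search.py, which by default loads the model produced by supernet.py, but it raises an error.
My understanding is that choice_model.py trains with one fixed choice. How do choice_model.py, supernet.py, and random_search.py differ from one another?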

Can not find cifar_train.py

In your usage, you wrote that:
python cifar_train.py exp_name spso_cifar

But I can not find cifar_train.py in your repository.
