xiangning-chen / smoothdarts

Code for our ICML'2020 paper "Stabilizing Differentiable Architecture Search via Perturbation-based Regularization"

Python 86.46% Jupyter Notebook 13.54%

smoothdarts's Issues

How to draw the fig???

Hi, Xiangning,

I am very interested in your paper and your code. Great job!
My question is how to draw Figure 1 in your paper.

I look forward to your explanation.

ImageNet training details

Dear authors,

Thanks for providing the code of SDARTS.

I noticed that in sota/cnn/train_imagenet.py the batch_size is set to 1024. Does that mean the ImageNet results of SDARTS in the paper were obtained with 8-GPU training, i.e., the same setup used by PC-DARTS / P-DARTS?

Thanks!

Question about reproducing SmoothDARTS on PTB

Hi, Xiangning.

Thanks for your great work and open-source code.

I evaluated (training for 300 epochs) the initial architecture and the architectures obtained after searching for 50 epochs with random and pgd perturbation, respectively. The results are strange, as described below.

In 4 runs, the architecture does not improve after searching.

The random architecture seems better than your searched result.
I also evaluated your results, shown in the last line. Every initial architecture is better than your random architecture, and one of the initial architectures is better than your pgd architecture.

I evaluated the architecture obtained after 50 epochs. Did you use early stopping?

Could you please help me figure it out? Thx!

pytorch&torchvision version

Hi,
Thanks for your good work.
I'm trying to run this model, but I have a question about the PyTorch and torchvision versions.
Could you please tell me which PyTorch and torchvision versions were used in the experiments?

s1~s4 training search setting

Hi, thanks for releasing the code.
I ran the search and evaluation experiments on s1~s4 on CIFAR-10, but the results did not match those in the paper.
Could you share more about the hyperparameter settings for these search spaces? Thanks!

why updateType = 'weight'?

Hi, would you please explain why you bypass softmax in the forward computation while applying projected gradient descent to maximize the loss w.r.t. arch_parameters? How does the performance compare to updateType='alpha'?

Thanks!
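To make the two options in this question concrete, here is a minimal sketch of where a perturbation could enter the forward pass. It is not the repository's code: `mixing_weights`, `perturb`, and the `epsilon` value are hypothetical names chosen for illustration, and uniform random noise stands in for the projected gradient step.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def perturb(values, epsilon, rng):
    """Add uniform noise in [-epsilon, epsilon] to each entry (stand-in for PGD)."""
    return [v + rng.uniform(-epsilon, epsilon) for v in values]


def mixing_weights(alpha, epsilon, update_type, rng):
    """Per-operation mixing weights for one edge (hypothetical helper)."""
    if update_type == "alpha":
        # Perturb the logits, then normalize: the result is still a probability vector.
        return softmax(perturb(alpha, epsilon, rng))
    if update_type == "weight":
        # Normalize first, then perturb: softmax is bypassed for the perturbed
        # values, so each weight stays within epsilon of its unperturbed value
        # but the weights need not sum to 1.
        return perturb(softmax(alpha), epsilon, rng)
    raise ValueError(f"unknown update_type: {update_type!r}")


rng = random.Random(0)
alpha = [0.1, -0.2, 0.3]  # assumed architecture logits for one edge
print(mixing_weights(alpha, 0.03, "alpha", rng))
print(mixing_weights(alpha, 0.03, "weight", rng))
```

Under this reading, the 'weight' variant bounds the perturbation directly in the space of mixing weights, whereas the 'alpha' variant bounds it in logit space and lets softmax rescale it.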

Did you use unrolling in the experiments?

Hi, thank you for sharing your code. I have a question about the unrolling.

SmoothDARTS/sota/cnn/train_search.py

Line 52 in 83710f1

    
    parser.add_argument('--unrolled', action='store_true', default=False, help='use one-step unrolled validation loss')

This flag is False by default, and the example command in README.md for SDARTS-RS does not set it: cd sota/cnn && python train_search.py --search_space=s1 --perturb_alpha=random. Did you use unrolling in the main experiments?
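The behaviour asked about here can be checked in isolation. The sketch below copies the `--unrolled` definition quoted above; the types and defaults given for `--search_space` and `--perturb_alpha` are assumptions made only so that the README command line parses.

```python
import argparse

parser = argparse.ArgumentParser()
# Copied from the quoted line of train_search.py.
parser.add_argument('--unrolled', action='store_true', default=False,
                    help='use one-step unrolled validation loss')
# Assumed definitions; only the flag names come from the README command.
parser.add_argument('--search_space', type=str, default='s1')
parser.add_argument('--perturb_alpha', type=str, default='none')

# Arguments from the README command for SDARTS-RS.
readme_args = parser.parse_args(['--search_space=s1', '--perturb_alpha=random'])
print(readme_args.unrolled)  # False: the flag was not passed

# The same command with the flag added explicitly.
unrolled_args = parser.parse_args(
    ['--search_space=s1', '--perturb_alpha=random', '--unrolled'])
print(unrolled_args.unrolled)  # True
```

With `action='store_true'`, the value is True only when `--unrolled` appears on the command line, so the README command as written runs without unrolling.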

ImageNet training gets "Killed" response

I have been attempting to run the imagenet_train program on Linux, and when I try I only get to this stage before it displays "Killed." The computer I am using has 31.1 GB of memory, which is already more than the previous computer I tried this on, so I'm not sure how much I actually need.
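A bare "Killed" with no Python traceback on Linux often means the kernel's out-of-memory killer terminated the process, although that is an assumption here since the log is not shown. One way to see how close a run gets to the machine's limit is to log the peak resident memory at a few points in the script. The helper names below are hypothetical, and the unit conversion assumes Linux, where `ru_maxrss` is reported in KiB.

```python
import resource


def peak_rss_gib():
    """Peak resident set size of this process in GiB (Linux reports KiB)."""
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib / (1024 ** 2)


def log_memory(tag):
    """Print the peak memory used so far, labelled with a checkpoint name."""
    print(f"[{tag}] peak RSS so far: {peak_rss_gib():.2f} GiB")


# Example checkpoints; in a training script these calls would go after
# building the dataset, after building the model, and inside the first epoch.
log_memory("startup")
```

Note that this counts only the calling process; memory held by separate worker processes is not included in the figure.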

Parameter-free ops still dominate the generated architecture?

Hi.

Thanks for releasing the code. Great job!
But when I ran the experiments on s1 (CIFAR-10), I got completely different results from the paper: whether I use the random or the pgd_linf method, almost all operations in the final architecture are still skip_conn. Are there any special tricks or hyperparameter settings for this experiment that are not mentioned?

I look forward to your reply. Thanks.
