Giter Site home page Giter Site logo

simon821 / enas Goto Github PK

View Code? Open in Web Editor NEW

This project forked from melodyguan/enas

0.0 1.0 0.0 3.88 MB

TensorFlow Code for paper "Efficient Neural Architecture Search via Parameter Sharing"

Home Page: https://arxiv.org/abs/1802.03268

License: Apache License 2.0

Python 95.90% Shell 4.10%

enas's Introduction

Efficient Neural Architecture Search via Parameter Sharing

Authors' implementation of "Efficient Neural Architecture Search via Parameter Sharing" (2018) in TensorFlow.

Includes code for CIFAR-10 image classification and Penn Tree Bank language modeling tasks.

Paper: https://arxiv.org/abs/1802.03268

Authors: Hieu Pham*, Melody Y. Guan*, Barret Zoph, Quoc V. Le, Jeff Dean

This is not an official Google product.

Penn Treebank

The Penn Treebank dataset is included at data/ptb. Depending on the system, you may want to run the script data/ptb/process.py to create the pkl version. All hyper-parameters are specified in these scripts.

To run the ENAS search process on Penn Treebank, please use the script

./scripts/ptb_search.sh

To run ENAS with a determined architecture, you have to specify the archiecture using a string. The following is an example script for using the architecture we described in our paper.

./scripts/ptb_final.sh

A sequence of architecture for a cell with N nodes can be specified using a sequence a of 2N + 1 tokens

  • a[0] is a number in [0, 1, 2, 3], specifying the activation function to use at the first cell: tanh, ReLU, identity, and sigmoid.
  • For each i, a[2*i] specifies a previous index and a[2*i+1] specifies the activation function at the i-th cell.

For a concrete example, the following sequence specifies the architecture we visualize in our paper

0 0 0 1 1 2 1 2 0 2 0 5 1 1 0 6 1 8 1 8 1 8 1

CIFAR-10

To run the experiments on CIFAR-10, please first download the dataset. Again, all hyper-parameters are specified in the scripts that we descibe below.

To run the ENAS experiments on the macro search space as described in our paper, please use the following scripts:

./scripts/cifar10_macro_search.sh
./scripts/cifar10_macro_final.sh

A macro architecture for a neural network with N layers consists of N parts, indexed by 1, 2, 3, ..., N. Part i consists of:

  • A number in [0, 1, 2, 3, 4, 5] that specifies the operation at layer i-th, corresponding to conv_3x3, separable_conv_3x3, conv_5x5, separable_conv_5x5, average_pooling, max_pooling.
  • A sequence of i - 1 numbers, each is either 0 or 1, indicating whether a skip connection should be formed from a the corresponding past layer to the current layer.

A concrete example can be found in our script ./scripts/cifar10_macro_final.sh.

To run the ENAS experiments on the micro search space as described in our paper, please use the following scripts:

./scripts/cifar10_micro_search.sh
./scripts/cifar10_micro_final.sh

A micro cell with B + 2 blocks can be specified using B blocks, corresponding to blocks numbered 2, 3, ..., B+1, each block consists of 4 numbers

index_1, op_1, index_2, op_2

Here, index_1 and index_2 can be any previous index. op_1 and op_2 can be [0, 1, 2, 3, 4], corresponding to separable_conv_3x3, separable_conv_5x5, average_pooling, max_pooling, identity.

A micro architecture can be specified by two sequences of cells concatenated after each other, as shown in our script ./scripts/cifar10_micro_final.sh

Citations

If you happen to use our work, please consider citing our paper.

@inproceedings{enas,
  title     = {Efficient Neural Architecture Search via Parameter Sharing},
  author    = {Pham, Hieu and
               Guan, Melody Y. and
               Zoph, Barret and
               Le, Quoc V. and
               Dean, Jeff
  },
  booktitle = {ICML},
  year      = {2018}
}

enas's People

Contributors

hyhieu avatar melodyguan avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.