Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets

This is the official PyTorch implementation of the paper Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets (ICLR 2023, Notable-top-25%, Spotlight): https://openreview.net/forum?id=SEh5SfEQtqB.

Abstract

Distillation-aware Neural Architecture Search (DaNAS) aims to search for an optimal student architecture that obtains the best performance and/or efficiency when distilling the knowledge from a given teacher model. Previous DaNAS methods have mostly tackled the architecture search for fixed source/target tasks and a fixed teacher, which does not generalize well to a new task and thus requires a costly search for every new combination of domains and teachers. For standard NAS tasks without KD, meta-learning-based, computationally efficient NAS methods have been proposed, which learn a generalized search process over multiple tasks and transfer the knowledge obtained over those tasks to a new task. However, since they assume learning from scratch without KD from a teacher, they might not be ideal for DaNAS scenarios, which could significantly affect the final performance of the architectures obtained from the search. To eliminate the excessive computational cost of DaNAS methods and the sub-optimality of rapid NAS methods, we propose a distillation-aware meta accuracy prediction model, DaSS (Distillation-aware Student Search), which can predict a given architecture’s final performance on a dataset when performing KD with a given teacher, without actually training it on the target task. The experimental results demonstrate that our proposed meta-prediction model successfully generalizes to multiple unseen datasets for DaNAS tasks, largely outperforming existing meta-NAS methods and rapid NAS baselines.

Framework of DaSS Model

Installation

$ conda env create -f dass.yaml
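
After creating the environment, activate it. The environment name is defined in dass.yaml; the command below assumes it is named dass:

$ conda activate dass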

Download datasets, checkpoints, and preprocessed features

$ python download/download_dataset.py --name all
$ python download/download_checkpoint.py 
$ python download/download_preprocessed.py 
$ rm preprocessed.zip
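
If you only need a single dataset, download_dataset.py presumably also accepts an individual dataset name via --name instead of all; the exact accepted values are defined in the script, and cub is assumed here as an example:

$ python download/download_dataset.py --name cub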

Meta-training DaSS model

You can download trained checkpoint files for the generator and predictor (see the download step above), or meta-train the DaSS model yourself:

$ bash script/run_meta_train.sh [GPU_NUM]
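
For example, to run meta-training on GPU 0:

$ bash script/run_meta_train.sh 0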

Rapid search using meta-learned DaSS model on unseen datasets

$ bash script/run_search.sh [GPU_NUM]
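
For example, to run the search on GPU 0:

$ bash script/run_search.sh 0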

Knowledge distillation (KD) for searched student on target unseen dataset

$ bash script/run_kd.sh [GPU_NUM] [DATASET_NAME]
$ bash script/run_kd.sh 0 cub
$ bash script/run_kd.sh 0 dtd
$ bash script/run_kd.sh 0 quickdraw
$ bash script/run_kd.sh 0 stanford_cars

Citation

If you find the provided code useful, please cite our work.

@article{lee2023meta,
  title={Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets},
  author={Lee, Hayeon and An, Sohyun and Kim, Minseon and Hwang, Sung Ju},
  journal={arXiv preprint arXiv:2305.16948},
  year={2023}
}

dass's People

Contributors

cownowan, hayeonlee


dass's Issues

DiffusionNAG

Dear Hayeon Lee,

I hope this message finds you well. I recently came across your work titled "DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models," and I was very impressed by the innovative approach and the promising results presented. The integration of diffusion models for neural architecture generation is particularly intriguing, and I believe it represents a significant advancement in the field.

I am keenly interested in exploring and understanding the methodologies and implementations detailed in your paper. I am currently attempting to replicate your work for further study and application within our projects.

However, I've encountered a challenge with opening and processing the datasets as described in your methodology. It seems there may be specific requirements or steps that I am missing, leading to difficulties in correctly accessing the data. This has impeded our progress and ability to fully appreciate the nuances of your work.

Given this, I was wondering if it might be possible for you to share the codebase or any related materials that could assist in resolving this issue. Additionally, any guidance or insights you could provide regarding the correct handling of the datasets would be immensely valuable. Your expertise and any further elaborations on the implementation details would greatly aid in our understanding and ability to replicate your impressive results.

Thank you very much for considering my request. Your work has sparked significant interest within our team, and we are eager to delve deeper into its potential applications. I look forward to the possibility of hearing back from you and perhaps even collaborating in some capacity in the future.

Warm regards,
