This project forked from google-deepmind/scalable_agent

A TensorFlow implementation of Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.

License: Apache License 2.0

scalable_agent's Introduction

Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

This repository contains an implementation of "Importance Weighted Actor-Learner Architectures", along with a dynamic batching module. This is not an officially supported Google product.

For a detailed description of the architecture please read our paper. Please cite the paper if you use the code from this repository in your work.

Bibtex

@inproceedings{impala2018,
  title={IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures},
  author={Espeholt, Lasse and Soyer, Hubert and Munos, Remi and Simonyan, Karen and Mnih, Volodymyr and Ward, Tom and Doron, Yotam and Firoiu, Vlad and Harley, Tim and Dunning, Iain and others},
  booktitle={Proceedings of the International Conference on Machine Learning (ICML)},
  year={2018}
}

Running the Code

Prerequisites

The code requires TensorFlow >=1.9.0-dev20180530, the DeepMind Lab environment, and the DeepMind Sonnet neural network library. Although this release uses DeepMind Lab, the agent has also been applied successfully to other domains such as Atari and Street View, and has been modified to generate images.

We include a Dockerfile that serves as a reference for the prerequisites and commands needed to run the code.

Single Machine Training on a Single Level

The command below trains on explore_goal_locations_small. Most runs should reach an average episode return of either about 200 or about 250 after 1B frames.

python experiment.py --num_actors=48 --batch_size=32

Adjust the number of actors (i.e. the number of environments) and the batch size to match the size of the machine the code runs on. A single actor, including DeepMind Lab, requires a few hundred MB of RAM.
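
As a rough aid for choosing --num_actors, the sketch below estimates how many actors fit in a given amount of RAM. The function name, the 300 MB per-actor figure, and the 4 GB reserve for the learner and operating system are illustrative assumptions, not measured values. The only figure taken from this README is "a few hundred MB" per actor.

```python
def max_actors_for_ram(available_ram_mb, ram_per_actor_mb=300, reserved_mb=4096):
    """Return a rough upper bound on --num_actors for a single machine.

    ram_per_actor_mb and reserved_mb are placeholder assumptions; measure
    the real footprint on your own machine before relying on the result.
    """
    usable_mb = available_ram_mb - reserved_mb
    if usable_mb <= 0:
        return 0
    return usable_mb // ram_per_actor_mb


if __name__ == "__main__":
    # A hypothetical machine with 32 GB of RAM.
    print(max_actors_for_ram(32 * 1024))  # prints 95
```

RAM is only one constraint. CPU cores also limit how many environments can step in parallel, so treat the result as an upper bound rather than a recommendation.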

Distributed Training on DMLab-30

The commands below train on the full DMLab-30 suite. Across 10 runs with identical hyperparameters but different seeds (--seed=[seed]), we observed capped human normalized training scores between 45 and 50. Test scores are usually about 2 percentage points lower in absolute terms.
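
The capped human normalized score can be illustrated with a short sketch. It assumes the usual definition: each level's return is rescaled so that a random agent scores 0 and a human scores 100, the result is capped at 100, and the capped values are averaged over levels. The function name, level names, and baseline returns below are made up for illustration and are not the real DMLab-30 baselines.

```python
def capped_human_normalized_score(agent, random, human, cap=100.0):
    """Mean over levels of min(cap, 100 * (agent - random) / (human - random))."""
    per_level = []
    for level, agent_return in agent.items():
        normalized = (
            100.0 * (agent_return - random[level]) / (human[level] - random[level])
        )
        # Capping stops a few superhuman levels from dominating the mean.
        per_level.append(min(normalized, cap))
    return sum(per_level) / len(per_level)


if __name__ == "__main__":
    # Hypothetical returns for two hypothetical levels.
    random_returns = {"level_a": 0.0, "level_b": 10.0}
    human_returns = {"level_a": 100.0, "level_b": 60.0}
    agent_returns = {"level_a": 150.0, "level_b": 20.0}
    # level_a: 150 capped to 100; level_b: 20; mean is 60.
    print(capped_human_normalized_score(agent_returns, random_returns, human_returns))
```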

Learner

python experiment.py --job_name=learner --task=0 --num_actors=150 \
    --level_name=dmlab30 --batch_size=32 --entropy_cost=0.0033391318945337044 \
    --learning_rate=0.00031866995608948655 \
    --total_environment_frames=10000000000 --reward_clipping=soft_asymmetric
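
For readers unfamiliar with the --reward_clipping=soft_asymmetric option, here is a minimal sketch of the idea in plain Python rather than TensorFlow. It assumes a tanh-based squashing in which negative rewards are scaled down relative to positive ones. The function name, the constants 5.0 and 0.3, and the exact form are assumptions; check experiment.py for what the code actually does.

```python
import math


def soft_asymmetric_clip(reward, scale=5.0, negative_weight=0.3):
    """Squash a reward smoothly, giving negative rewards less weight.

    scale and negative_weight are assumed values, not confirmed by this README.
    """
    squeezed = math.tanh(reward / scale)
    if reward < 0:
        squeezed *= negative_weight
    return squeezed * scale


if __name__ == "__main__":
    for r in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(r, soft_asymmetric_clip(r))
```

Under these assumptions, large positive rewards saturate near 5 while large negative rewards saturate near -1.5, so penalties influence the gradient less than gains of the same size.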

Actor(s)

for i in $(seq 0 149); do
  python experiment.py --job_name=actor --task=$i \
      --num_actors=150 --level_name=dmlab30 --dataset_path=[...] &
done;
wait

Test Score

python experiment.py --mode=test --level_name=dmlab30 --dataset_path=[...] \
    --test_num_episodes=10
