deepmind-atari's Introduction

----- DQN 3.0 -----

This project contains the source code of DQN 3.0, a Lua-based deep reinforcement
learning architecture. The code is what is needed to reproduce the experiments
described in the paper "Human-level control through deep reinforcement
learning", Nature 518, 529–533 (26 February 2015), doi:10.1038/nature14236.

To replicate the experimental results, the following dependencies need to be
installed:
    * LuaJIT and Torch 7.0
    * nngraph
    * Xitari (a fork of the Arcade Learning Environment (Bellemare et al., 2013))
    * AleWrap (a Lua interface to Xitari)
An install script for these dependencies is provided.

Two run scripts are provided: run_cpu and run_gpu. As the names imply,
the former trains the DQN network using regular CPUs, while the latter uses
GPUs (CUDA), which typically results in a significant speed-up.



----- Installation instructions -----

The installation requires Linux with apt-get.

Note: In order to run the GPU version of DQN, you should additionally have the
NVIDIA® CUDA® (version 5.5 or later) toolkit installed prior to the Torch
installation below.
The toolkit can be downloaded from https://developer.nvidia.com/cuda-toolkit
and installation instructions can be found at
http://docs.nvidia.com/cuda/cuda-getting-started-guide-for-linux
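
The two prerequisites named above (apt-get, and the CUDA toolkit for the GPU
version) can be checked before installing. The following is a minimal sketch,
not part of the package: the helper name have_cmd is hypothetical, and it
assumes the CUDA toolkit puts its compiler driver nvcc on the PATH. It only
tests for presence and does not verify the CUDA version.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): succeeds if the named
# command can be found on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# The install script relies on apt-get.
have_cmd apt-get || echo "apt-get not found: the install script is unlikely to work" >&2

# Assumption: an installed CUDA toolkit exposes nvcc on the PATH.
have_cmd nvcc || echo "nvcc not found: run_gpu will probably not work" >&2
```

Both messages are warnings only; the sketch does not stop the installation.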


To train DQN on Atari games, the following components must be installed:
    * LuaJIT and Torch 7.0
    * nngraph
    * Xitari
    * AleWrap

To install all of the above in a subdirectory called 'torch', it should be enough to run

    ./install_dependencies.sh

from the base directory of the package.


Note: The above install script will install the following packages via apt-get:
build-essential, gcc, g++, cmake, curl, libreadline-dev, git-core, libjpeg-dev,
libpng-dev, ncurses-dev, imagemagick, unzip



----- Training DQN on Atari games -----

Prior to running DQN on a game, you should copy its ROM into the 'roms' subdirectory.
It should then be sufficient to run the script

    ./run_cpu <game name>

Or, if GPU support is enabled,

    ./run_gpu <game name>
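
A missing ROM can be caught before training starts. The sketch below is
illustrative only: the helper name rom_present is hypothetical, and it assumes
the ROM file is named after the game with some file extension (for example
roms/breakout.<ext>), which the package itself may or may not require.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of the package).
# Succeeds if <roms_dir> contains a regular file named <game>.<anything>.
# Assumption: ROM files are named after the game, with an extension.
rom_present() {
    game=$1
    roms_dir=${2:-roms}
    for f in "$roms_dir/$game".*; do
        # If the glob matched nothing, $f is the literal pattern and -f fails.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Example usage (shown as a comment so that nothing is launched here):
#   rom_present breakout && ./run_cpu breakout
```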


Note: On a system with more than one GPU, DQN training can be launched on a
specified GPU by setting the environment variable GPU_ID, e.g. by

    GPU_ID=2 ./run_gpu <game name>

If GPU_ID is not specified, the first available GPU (ID 0) will be used by default.
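
The default described above can be expressed with shell parameter expansion.
This is a minimal sketch of the idea, not the contents of run_gpu; the helper
name resolve_gpu_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not taken from run_gpu): prints the GPU ID to use,
# falling back to 0 when GPU_ID is unset or empty.
resolve_gpu_id() {
    printf '%s\n' "${GPU_ID:-0}"
}

echo "Using GPU $(resolve_gpu_id)"
```

Because the variable is set only for the one command in
GPU_ID=2 ./run_gpu <game name>, other runs started from the same shell keep
the default.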



----- Options -----

Options to DQN are set within run_cpu (respectively, run_gpu). You may,
for example, want to change how often progress information is written
to stdout by setting 'prog_freq' to a different value.
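
Editing the run script by hand is enough, but the change can also be scripted.
The sketch below assumes the run script holds each option on its own line in
the form name=value (for example prog_freq=5000); that layout is an assumption
and should be checked against the actual run_cpu or run_gpu before use. The
helper name set_option is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the package): rewrite a `name=value`
# line in a script, keeping any leading indentation.
# Assumption: the option is stored as `name=value` on a line of its own.
set_option() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then copy it back with cat so
    # the original file keeps its permissions (e.g. the executable bit).
    sed "s/^\([[:space:]]*\)${name}=.*/\1${name}=${value}/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage (shown as a comment so that no file is modified here):
#   set_option run_cpu prog_freq 10000
```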

deepmind-atari's Issues

Segfault when running ./run_gpu and ./run_cpu

I followed the instructions in README.md after creating a new user account atari due to some version issues with my pre-existing installation of torch.

After setting ulimit -c unlimited and running gdb ../torch/bin/luajit core, I get the following clue as to what is causing the segfault.

Core was generated by `../torch/bin/luajit train_agent.lua -framework alewrap -game_path /home/atari/d'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fab3edf1d64 in ale::StellaEnvironment::emulate(ale::Action, ale::Action, unsigned long) () from /home/atari/deepmind-atari/torch/lib/libxitari.so

I'd like to continue diving into the problem, but I could use some help on how to proceed. Should I be raising this issue with https://github.com/deepmind/xitari instead of here since xitari seems to be behind my issue?

Moving from SpatialConvolutionCUDA to SpatialConvolution possibly broke things?

Hi!

Thank you for your fork of this project, and thank you for all your brilliant work on Torch.

I cloned this repository and ran it on one of my Titan Z GPUs for two to three days straight. It's been learning to play Breakout. However, when I use the resulting network file, enable graphics, and use qlua to display visual output, the agent seems to play no more intelligently than an untrained version: it never scores more than a few points in a row. Here are the steps I took:

  1. Cloned deepmind-atari.
  2. Changed ../torch/bin/luajit train_agent.lua $args to ../torch/bin/qlua train_agent.lua $args in run_gpu.
  3. Changed display=false to display=true in torch/share/lua/5.1/alewrap/AleEnv.lua
  4. Ran ./run_gpu breakout for two to three days. This produced the network file dqn/DQN3_0_1_breakout_FULL_Y.t7.
  5. Changed netfile="\"convnet_atari3\"" in run_gpu to netfile="\"DQN3_0_1_breakout_FULL_Y.t7\"".

My machine reached approximately 12 million steps. Maybe this is not enough?

I'm concerned that change 88fffea may have broken this demo. Is this at all possible? Or do you have any suggestions on what might be wrong? I greatly appreciate any input!

Running issue

When I tried to run the code, I got:

-framework alewrap -game_path /home/cave/atari22/deepmind-atari/roms/ -name DQN3_0_1_breakout_FULL_Y -env breakout -env_params useRGB=true -agent NeuralQLearner -agent_params lr=0.00025,ep=1,ep_end=0.1,ep_endt=replay_memory,discount=0.99,hist_len=4,learn_start=50000,replay_memory=1000000,update_freq=4,n_replay=1,network="convnet_atari3",preproc="net_downsample_2x_full_y",state_dim=7056,minibatch_size=32,rescale_r=1,ncols=1,bufferSize=512,valid_size=500,target_q=10000,clip_delta=1,min_reward=-1,max_reward=1 -steps 50000000 -eval_freq 250000 -eval_steps 125000 -prog_freq 5000 -save_freq 125000 -actrep 4 -gpu -1 -random_starts 30 -pool_frms type="max",size=2 -seed 1 -threads 4
../torch/bin/luajit: ./initenv.lua:8: module 'torch' not found:
    no field package.preload['torch']
    no file '/root/.luarocks/share/lua/5.1/torch.lua'
    no file '/root/.luarocks/share/lua/5.1/torch/init.lua'
    no file '/root/torch/install/share/lua/5.1/torch.lua'
    no file '/root/torch/install/share/lua/5.1/torch/init.lua'
    no file './torch.lua'
    no file '/root/torch/install/share/luajit-2.1.0-beta1/torch.lua'
    no file '/usr/local/share/lua/5.1/torch.lua'
    no file '/usr/local/share/lua/5.1/torch/init.lua'
    no file '/root/torch/install/lib/torch.so'
    no file '/root/.luarocks/lib/lua/5.1/torch.so'
    no file '/root/torch/install/lib/lua/5.1/torch.so'
    no file './torch.so'
    no file '/usr/local/lib/lua/5.1/torch.so'
    no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
    [C]: in function 'require'
    ./initenv.lua:8: in main chunk
    [C]: in function 'require'
    train_agent.lua:8: in main chunk
    [C]: at 0x00406260
