toybox-rs / toybox

The Machine Learning Toybox for testing the behavior of autonomous agents.

Home Page: http://toybox.rs

Languages: Python 59.14%, Shell 0.72%, Dockerfile 0.05%, HTML 40.09%
Topics: causality, explanation, explainable-ai, xai, atari, reinforcement-learning

The Reinforcement Learning Toybox

A set of games designed for testing deep RL agents. This repo contains Python wrappers and an intervention API for Toybox games. Python wrappers for the Atari games are constructed to mock the Arcade Learning Environment and subclass the gym.envs.atari.AtariEnv wrapper. ToyboxBaseEnv may be a good entry point for the gym wrappers.
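
For instance, here is a minimal sketch of driving one of the wrappers through gym. The env ID follows the ToyboxNoFrameskip naming we believe this package registers; treat both the ID and the registering import as assumptions and check the registration code for the exact names.

import gym
import toybox  # assumption: importing the package registers the Toybox envs

# "BreakoutToyboxNoFrameskip-v4" is an assumed env ID.
env = gym.make("BreakoutToyboxNoFrameskip-v4")
obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
env.close()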

If you use this code, or otherwise are inspired by our white-box testing approach, please cite our NeurIPS workshop paper:

@inproceedings{foley2018toybox,
  title={{Toybox: Better Atari Environments for Testing Reinforcement Learning Agents}},
  author={Foley, John and Tosch, Emma and Clary, Kaleigh and Jensen, David},
  booktitle={{NeurIPS 2018 Workshop on Systems for ML}},
  year={2018}
}

We have a lengthier paper on arXiv and can provide a draft of a non-public paper on our acceptance-testing framework by request (email etosch at cs dot umass dot edu).

How accurate are your games?

Watch four minutes of agents playing each game. Both the ALE implementations and the Toybox implementations have their idiosyncrasies, but the core gameplay and concepts have been captured. Pull requests to improve fidelity are always welcome.

Where is the actual Rust code?

The Rust implementations of the games have moved to a separate repository: toybox-rs/toybox-rs

Installation

  1. Create a virtual environment using your python3 installation: ${python} -m venv venv
    • OSX
      • On OSX, ${python} is likely python3; thus, your command will be python3 -m venv venv
      • If you are not sure of your version, run python --version
    • Windows (not fully tested!)
      • If you are on Windows, your command will likely be: python -m venv venv
  2. Activate your virtual environment:
    • BSD-ish: source venv/bin/activate
    • Windows: venv\Scripts\activate
  3. Install Toybox:
pip install ctoybox
pip install git+https://github.com/toybox-rs/Toybox

  4. Install requirements: run pip install -r REQUIREMENTS.txt
  5. Run python setup.py install

Note: if you are trying to run from Windows, you will need to build from source. See instructions for building here.

Play the games (using pygame)

pip install ctoybox pygame
python -m ctoybox.human_play breakout
python -m ctoybox.human_play amidar
python -m ctoybox.human_play space_invaders

Run the tests

Sample behavioral tests developed with Toybox are frozen and available here. These tests come with an OpenAI Baselines integration to facilitate off-the-shelf model training.

Python

TensorFlow, OpenAI Gym, OpenCV, and other libraries may or may not break with various Python versions. We have confirmed that the code in this repository works with the following Python versions:

  • 3.6, 3.7

Get starting images for reference from ALE / atari_py

./scripts/utils/start_images --help

Contributors

etosch, jjfiv, kclary, miffyli


Issues

Fix sprite rendering in human_play

Right now human_play just draws each sprite as a rectangle. Maybe we can even use software rendering to a texture for this backend, so it always stays in sync?
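
One possibility, sketched below with pygame; the frame source here is a zero-filled placeholder, since the real fix would pull the software-rendered buffer from the Rust side.

import numpy as np
import pygame

# Placeholder: an RGB frame (height x width x 3, uint8) that would come
# from the game's software renderer instead of this zero array.
frame = np.zeros((210, 160, 3), dtype=np.uint8)

pygame.init()
screen = pygame.display.set_mode((frame.shape[1], frame.shape[0]))
# pygame surfaces are (width, height), so transpose the (H, W, 3) array.
surface = pygame.surfarray.make_surface(frame.transpose(1, 0, 2))
screen.blit(surface, (0, 0))
pygame.display.flip()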

GridWorld game environment

@kclary says this is a very common RL benchmark, and it should serve as a simpler, tutorial-style game in our toybox to which we can direct people who want to implement their own games.

Assigning to me for now.

toybox.py: toybox.get_score() returning zero

@kclary was working on this. Now with the graphics up and running, I can tell that toybox (Rust) knows what the score is, but it's getting lost somewhere on the way out. Maybe we just need a Python/ctypes type annotation?
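
If the value is being mangled at the FFI boundary, a missing ctypes annotation is a plausible culprit. A minimal sketch of that kind of fix, with hypothetical library and symbol names:

import ctypes

# Hypothetical library and symbol names, for illustration only.
lib = ctypes.CDLL("libctoybox.so")

# Without explicit annotations, ctypes assumes int arguments and an int
# return value, which can silently truncate or garble the score.
lib.simulation_score.argtypes = [ctypes.c_void_p]
lib.simulation_score.restype = ctypes.c_int32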

Testing for paper

Breakout

  • last brick
  • ez channel
  • polar angles

Amidar

  • last segment
  • ez caught
  • enemy avoidance

Stealing @kclary's board notation.

Breakout paddle should allow user to control bounce

This is not real physics, but people usually alter the velocity of the ball coming off the paddle as if the paddle were curved. This lets the player reach all the bricks (otherwise the ball just bounces at the same angle forever).
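
A sketch of the usual trick, not tied to the actual Breakout implementation: map where the ball hits the paddle to an outgoing angle, so the player can steer the ball.

import math

def bounce_velocity(ball_x, paddle_x, paddle_width, speed, max_angle=math.radians(60)):
    """Return (vx, vy) for a ball leaving the paddle.

    hit is -1.0 at the paddle's left edge, 0.0 at the center, and +1.0 at
    the right edge; the outgoing angle scales with it, as if the paddle
    surface were curved.
    """
    hit = (ball_x - paddle_x) / (paddle_width / 2.0)
    hit = max(-1.0, min(1.0, hit))
    angle = hit * max_angle
    return speed * math.sin(angle), -speed * math.cos(angle)

# A center hit goes straight up; edge hits leave at up to 60 degrees.
print(bounce_velocity(ball_x=52.0, paddle_x=48.0, paddle_width=24.0, speed=4.0))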

Score behavior after end of game

While watching trained models play both Breakout and Amidar, occasionally during game-over the score will jump up from what was last seen on screen. Could this be due to the 4-step frame concatenation before checking whether the player died? There seem to be many 4-step blocks between when you see the agent die and the screen go black, and when the rendering step sees that the game is over; that's when the score increase happens.

For Amidar, these can be large increases (e.g., from 74 to 125, or from 278 to 379).

Migrate Breakout to use real physics computation

Rust has a nice collision API, ncollide2d, that would let us trivially add arbitrary shapes to our Breakout game. It also supports time-of-impact-style queries, so that when the ball ends up going too fast it won't "make mistakes".

Right now, we can just simulate tiny timesteps in our game, but that's going to be less efficient than using a real solver...
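
To illustrate the tunneling problem those time-of-impact queries would solve, here is a toy sketch of the tiny-timestep workaround (illustrative only, not the game's actual code): faster balls get more substeps, so they cannot skip over a brick in a single update.

def step_ball(pos, vel, dt, collide, max_travel=1.0):
    """Advance the ball, splitting dt into substeps short enough that the
    ball moves at most max_travel units per substep, a naive guard
    against tunneling through thin bricks."""
    speed = (vel[0] ** 2 + vel[1] ** 2) ** 0.5
    n = max(1, int(speed * dt / max_travel))  # more substeps when faster
    sub = dt / n
    for _ in range(n):
        pos = (pos[0] + vel[0] * sub, pos[1] + vel[1] * sub)
        vel = collide(pos, vel)  # resolve any brick/wall contact
    return pos, vel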

Identify agent and training regimen that takes the least amount of time

We need to evaluate our system by training an agent on both our system and the OpenAI Gym one. It doesn't matter how good the agent is, so long as it does well enough that we can compare runtime performance meaningfully. Therefore, we should identify the setup that takes the least amount of time, so that we can iterate rapidly.

Amidar enemies are too fast

Probably explains why the agent sucks? idk, player and enemy speeds are synced so that should be OK, but I no longer know anything.

Implement configurable input

Everything is parameterized in https://github.com/KDL-umass/Amidar; right now the Rust implementation is totally hard-coded. It should take input files the way that Amidar does. This should live in a branch until we are done training, since it is only needed for experimentation.
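
For what it's worth, the Python intervention layer already passes game config around as JSON; here is a sketch of how file-driven parameters could flow through it. Method names like config_to_json and write_config_json reflect our reading of the ctoybox API and should be treated as assumptions.

import json
from ctoybox import Toybox  # assumes Toybox is exported at the package top level

with Toybox("amidar") as tb:
    config = tb.config_to_json()        # assumed accessor for the game config
    with open("amidar_config.json") as f:
        config.update(json.load(f))     # parameters come from a file, not code
    tb.write_config_json(config)        # assumed setter; applies the new config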
