a2c's Introduction

Project Details


In this project, an Advantage Actor Critic (A2C) network is trained to control robotic arms so that each arm reaches and follows a moving target ball.

The Environment

The environment for this project involves controlling a double-jointed arm to reach target locations.

State Space

The state is continuous. The state vector has 33 dimensions, corresponding to the position, rotation, velocity, and angular velocity of the arm.

Action Space

Each action is a vector of 4 numbers, corresponding to the torques applied to the two joints. Every entry in the action vector must be between -1 and 1.
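
The range constraint above can be enforced with a simple clipping step before actions are sent to the environment. The following is a minimal sketch, not the code used in this repository; the function name `clip_actions` and the sample values are hypothetical.

```python
import numpy as np

ACTION_SIZE = 4  # from the text: each action has 4 entries


def clip_actions(raw_actions):
    """Clamp raw policy outputs to the valid range [-1, 1].

    Accepts a single action of shape (4,) or a batch of shape (n_agents, 4).
    """
    actions = np.asarray(raw_actions, dtype=np.float32)
    if actions.shape[-1] != ACTION_SIZE:
        raise ValueError(
            f"expected last dimension {ACTION_SIZE}, got {actions.shape[-1]}"
        )
    return np.clip(actions, -1.0, 1.0)


# Hypothetical raw outputs for two agents; out-of-range entries get clamped.
raw = np.array([[0.3, -1.7, 2.5, 0.0],
                [1.0, -1.0, 0.5, -0.2]])
clipped = clip_actions(raw)
```

Clipping is only one option; a policy whose output layer is already bounded (for example with `tanh`) would not need it.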

Reward

A reward of +0.1 is provided for each step that the agent's hand is in the goal location.

Goal

Maintain the agent's hand at the target location for as many time steps as possible.

Solving the Environment

The environment is considered solved when the average score, taken over 100 consecutive episodes and over all agents, reaches +30.

* The version with 20 identical copies of the agent sharing the same experience is used in this experiment.
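
The solving criterion can be checked with a rolling window over per-episode scores. This is a rough sketch under the assumption that each episode yields one score per agent, which is then averaged across agents; the class name `SolveTracker` and its methods are hypothetical and not taken from this repository.

```python
from collections import deque

import numpy as np

SOLVE_THRESHOLD = 30.0  # from the text: average score of +30
WINDOW = 100            # from the text: 100 consecutive episodes


def episode_score(per_agent_scores):
    """Average the episode scores of all agents into one number."""
    return float(np.mean(per_agent_scores))


class SolveTracker:
    """Keeps the most recent episode scores and reports whether the task is solved."""

    def __init__(self, window=WINDOW, threshold=SOLVE_THRESHOLD):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def add(self, per_agent_scores):
        """Record one episode and return True if the criterion is now met."""
        self.scores.append(episode_score(per_agent_scores))
        return self.solved()

    def average(self):
        """Mean of the stored episode scores (0.0 if none are stored yet)."""
        return float(np.mean(self.scores)) if self.scores else 0.0

    def solved(self):
        """True only once a full window is stored and its mean reaches the threshold."""
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and self.average() >= self.threshold
```

Since each step in the goal location gives +0.1, a score of +30 corresponds to roughly 300 steps per episode with the hand on target.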

Getting Started

Step 1: Clone the Project and Install Dependencies

* Please prepare a Python 3 virtual environment if necessary.

git clone https://github.com/qiaochen/A2C.git
cd A2C/install_requirements
pip install .

Step 2: Download the Unity Environment

For this project, I use the environment from Udacity. Links to the builds for different operating systems are copied here for convenience:

  • Linux: click here
  • Mac OSX: click here
  • Windows (32-bit): click here
  • Windows (64-bit): click here

I ran my experiments on Ubuntu 16.04, so I picked the first option. Extract the archive and place the Reacher_Linux folder in the project root. The project folder structure should then look like this (program-generated .png and model files are excluded):

Project Root
     |-install_requirements (Folder)
     |-README.md
     |-Report.md
     |-agent.py
     |-models.py
     |-train.py
     |-test.py
     |-utils.py
     |-Reacher_Linux (Folder)
            |-Reacher.x86_64
            |-Reacher.x86
            |-Reacher_Data (Folder)
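
Before training, it may help to confirm that the layout above is in place. The snippet below is a small sketch of such a check; the helper `missing_entries` is hypothetical and not part of this repository, and the list of paths is copied from the tree above.

```python
from pathlib import Path

# Paths relative to the project root, taken from the folder structure above.
EXPECTED = [
    "agent.py",
    "models.py",
    "train.py",
    "test.py",
    "utils.py",
    "Reacher_Linux/Reacher.x86_64",
    "Reacher_Linux/Reacher.x86",
    "Reacher_Linux/Reacher_Data",
]


def missing_entries(project_root="."):
    """Return the expected files or folders that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

Calling `missing_entries()` from the project root returns an empty list when everything is in place, or the relative paths that still need to be added.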

Instructions for Running the Program


Step 1: Training

python train.py

After training, the following files will be generated and placed in the project root folder:

  • best_model.checkpoint (the trained model)
  • training_100avgscore_plot.png (a plot of avg. scores during training)
  • training_score_plot.png (a plot of per-episode scores during training)
  • unity-environment.log (log file created by Unity)

Step 2: Test

python test.py

Test performance is summarized in a plot generated in the project root:

  • test_score_plot.png
