This project trains agents to solve the Unity Tennis environment. The video below shows the two agents interacting in the environment. Each racket is an agent, and the goal is to keep the ball in play for as long as possible.
Reward: +0.1 for an individual agent if it hits the ball over the net; -0.01 if an agent lets the ball hit the ground or hits the ball out of bounds.
Observation Space: 3 stacks of 8 variables (24 values per agent) corresponding to the position and velocity of the ball and racket. Each agent receives its own local observation.
Action Space: 2 continuous actions corresponding to movement toward (or away from) the net and jumping.
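To make the shapes described above concrete, here is a minimal sketch using NumPy only (it does not load the Unity environment). The constant names are hypothetical, and the action range of [-1, 1] is an assumption that is not stated in this README.

```python
import numpy as np

NUM_AGENTS = 2       # one agent per racket
STACKED_FRAMES = 3   # the "3 stacks" in the observation description
VARS_PER_FRAME = 8   # position and velocity of the ball and racket
ACTION_SIZE = 2      # movement toward/away from the net, and jumping

# Each agent sees its own local observation of 3 * 8 = 24 values.
obs_size = STACKED_FRAMES * VARS_PER_FRAME


def random_actions(rng, low=-1.0, high=1.0):
    """Sample one action vector per agent (range is an assumption)."""
    return rng.uniform(low, high, size=(NUM_AGENTS, ACTION_SIZE))


rng = np.random.default_rng(0)
observations = np.zeros((NUM_AGENTS, obs_size))  # placeholder observations
actions = np.clip(random_actions(rng), -1.0, 1.0)

print(observations.shape, actions.shape)
```

A random policy like this is only useful as a smoke test of the array shapes; the trained agents replace `random_actions` with the output of their actor networks.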
Goal: Achieve an average score of at least +0.5 over a window of 100 consecutive episodes, after taking the maximum score per episode over both agents. Specifically,
- After each episode, we add up the rewards that each agent received (without discounting), to get a score for each agent. This yields 2 (potentially different) scores. We then take the maximum of these 2 scores.
- This yields a single score for each episode.
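The scoring rule above can be sketched in a few lines. This is an illustration only: the function names `episode_score` and `is_solved` are hypothetical, and the reward values in the example are made up to show the arithmetic.

```python
from collections import deque

import numpy as np

SOLVED_THRESHOLD = 0.5  # target average score from the goal above
WINDOW = 100            # number of consecutive episodes in the window


def episode_score(rewards_per_step):
    """Sum undiscounted rewards per agent, then take the max over agents.

    rewards_per_step: array-like of shape (steps, 2), one column per agent.
    """
    totals = np.sum(np.asarray(rewards_per_step, dtype=float), axis=0)
    return float(np.max(totals))


def is_solved(window_scores):
    """True once a full window of scores averages at least the threshold."""
    full = len(window_scores) == WINDOW
    return bool(full and np.mean(window_scores) >= SOLVED_THRESHOLD)


# Made-up rewards for a three-step episode: agent 0 totals 0.2, agent 1 totals 0.09.
example_rewards = [[0.1, 0.0], [0.0, 0.1], [0.1, -0.01]]
score = episode_score(example_rewards)

scores_window = deque(maxlen=WINDOW)  # keeps only the latest 100 episode scores
scores_window.append(score)
print(score, is_solved(scores_window))
```

Using a `deque` with `maxlen=WINDOW` means old scores drop out automatically, so the mean is always taken over the most recent episodes.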
- Clone this repository.
- Download the environment from one of the links below. You only need to select the environment that matches your operating system:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): Already included in the repo
(For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.
(For AWS) If you'd like to train the agent on AWS (and have not enabled a virtual screen), then please use this link to obtain the "headless" version of the environment. You will not be able to watch the agent without enabling a virtual screen, but you will be able to train the agent. (To watch the agent, you should follow the instructions to enable a virtual screen, and then download the environment for the Linux operating system above.)
- Unzip the downloaded file into the directory where you cloned this repo. Alternatively, you can create your own project directory and place the files there.
- Create and activate a Python environment for this project. (I used a PyCharm venv.)
- Download all the dependencies.
- Install all the dependencies in the python folder:
  cd python
  pip install .
- Note: If installing the dependencies fails, I included a requirements.txt file that lists all the project packages and the versions I used. You can run
  pip install -r requirements.txt
  from the command line as long as you are in the project home directory.
- To get a newer version of PyTorch, go here and follow the directions on the homepage.
- I recommend pip installing a newer version of TensorFlow because the version required by Udacity appears to have a security flaw (as reported by GitHub).
Follow along in Tennis.ipynb to learn more about the environment, train/test an agent, and view the results of a trained agent.
If you just want to see a summary of the project check out report.md.
- checkpoints directory: Directory of saved model parameters.
- images directory: Directory of saved graphs and gifs.
- scores directory: Directory of saved scores from training.
- agents.py: Python file that contains the MADDPG class and DDPG class.
- networks.py: Python file that contains the actor and critic networks.
- utils.py: Python file that contains the Replay Buffer class and OUNoise class.
- Tennis_Windows_x86_64 directory: A download of the Unity environment for 64-bit Windows systems.
- python directory: The install and setup directory from Udacity for this project.
- README file: The README file you are currently reading.
- Report file: An explanation of the project, the results of the project, and future ideas for the project.
- Soccer Notebook: The initial setup for a more complex Unity environment than the Tennis environment.
- Tennis Notebook: The walkthrough for solving the Unity Tennis environment.
- dependencies file: A list of all the packages and their version numbers used in this project.
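For readers unfamiliar with the two helpers that the file list attributes to utils.py, the sketch below shows what a replay buffer and an Ornstein-Uhlenbeck noise process typically look like. It is not the code from this repository: the class interfaces, method names, and default parameters (`theta=0.15`, `sigma=0.2`) are assumptions chosen for illustration.

```python
import random
from collections import deque

import numpy as np


class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, seed=0):
        self.mu = mu * np.ones(size)  # long-run mean the state is pulled toward
        self.theta = theta            # strength of the pull toward the mean
        self.sigma = sigma            # scale of the random perturbation
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Return the internal state to the mean."""
        self.state = self.mu.copy()

    def sample(self):
        """Advance the process one step and return the new state."""
        noise = self.rng.standard_normal(self.state.shape)
        dx = self.theta * (self.mu - self.state) + self.sigma * noise
        self.state = self.state + dx
        return self.state


class ReplayBuffer:
    """Fixed-size buffer of experience tuples, sampled uniformly at random."""

    def __init__(self, capacity, seed=0):
        self.memory = deque(maxlen=capacity)  # oldest experiences are dropped first
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return self.rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


noise = OUNoise(size=2)
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(np.zeros(24), noise.sample(), 0.0, np.zeros(24), False)
print(len(buffer), len(buffer.sample(2)))
```

In DDPG-style training, the noise sample is usually added to the actor's output before the action is sent to the environment, and minibatches drawn from the buffer are used to update the actor and critic.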