danielpalaio / pong-v4_deeprl

Atari OpenAI Pong-v4 DeepRL-based solutions (DQN, DuelingDQN, D3QN)

License: MIT License

OpenAI Pong-v4 DeepRL-based solutions

This investigation was carried out during the development of the master's thesis "DeepRL-based Motion Planning for Indoor Mobile Robot Navigation" at the Institute of Systems and Robotics, University of Coimbra (ISR-UC).

Software/Requirements

| Module | Software/Hardware |
| --- | --- |
| Python IDE | PyCharm |
| Deep Learning library | TensorFlow + Keras |
| GPU | GeForce GTX 1060 |
| Interpreter | Python 3.8 |
| Packages | requirements.txt |

To set up PyCharm + Anaconda + GPU, consult the setup file here.
To install the required packages (requirements.txt), download the file into the project folder and run the following command in the project environment terminal:

pip install -r requirements.txt

⚠️ WARNING ⚠️

The training process generates a .txt file that tracks the network models (saved in 'tf' and .h5 formats) that met the environment's solved requirement. An overview image (graph) of the training procedure is also created.
Before running another training procedure, change the .txt, .png, and directory names. Otherwise, the information from previous training runs will be overwritten and lost.

To test a saved network model in .h5 format, a 5-episode training run is required first to initialize/build the keras.model network, so the warnings above also apply in this situation.
Loading the saved model in 'tf' format is the recommended option. After testing finishes, an overview image (graph) of the training procedure is also generated.
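
One way to avoid overwriting earlier runs is to derive the .txt, .png, and directory names from a timestamp. The sketch below is a hypothetical helper and not part of the repository; the function name `make_run_paths` and the file-name patterns are assumptions chosen for illustration.

```python
from datetime import datetime
from pathlib import Path


def make_run_paths(algorithm, root=".", now=None):
    """Build unique output paths for one training run (hypothetical naming scheme)."""
    # A timestamp makes every run name distinct, so earlier outputs are not overwritten.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run = f"{algorithm}_{stamp}"
    root = Path(root)
    return {
        "log": root / f"{run}_saved_networks.txt",   # tracks models that met the solved requirement
        "plot": root / f"{run}_training.png",        # overview graph of the run
        "models": root / f"saved_networks_{run}",    # directory for the saved models
    }
```

The returned paths can then be passed to whatever code writes the log, the plot, and the saved models.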

OpenAI Atari Pong-v4

Actions:
0 - No action
1 - No action
2 - Racket moves up
3 - Racket moves down
4 - Racket moves up
5 - Racket moves down
Actions (2 & 4) and (3 & 5) produce the same movement with different amplitudes

States:
Stack of 4 cropped, grayscale images of shape (80, 80), i.e. 6400 pixels each
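
A minimal sketch of how such a state could be built from raw frames, assuming the usual (210, 160, 3) RGB Atari frame. The crop rows (35 to 195), the channel-averaging grayscale conversion, the scaling to [0, 1], and the names `preprocess_frame` and `FrameStack` are assumptions for illustration; the repository's own preprocessing may differ.

```python
from collections import deque

import numpy as np


def preprocess_frame(frame):
    """Turn a (210, 160, 3) RGB frame into an (80, 80) grayscale image in [0, 1]."""
    cropped = frame[35:195]        # assumed crop: keep the playfield rows -> (160, 160, 3)
    gray = cropped.mean(axis=2)    # average the colour channels -> (160, 160)
    small = gray[::2, ::2]         # keep every second pixel -> (80, 80)
    return (small / 255.0).astype(np.float32)


class FrameStack:
    """Keep the most recent `size` preprocessed frames as one state."""

    def __init__(self, size=4):
        self.frames = deque(maxlen=size)

    def reset(self, frame):
        # At episode start, fill the whole stack with copies of the first frame.
        processed = preprocess_frame(frame)
        for _ in range(self.frames.maxlen):
            self.frames.append(processed)
        return self.state()

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(preprocess_frame(frame))
        return self.state()

    def state(self):
        return np.stack(self.frames, axis=0)   # shape (4, 80, 80)
```

Stacking consecutive frames lets the network infer the ball's direction and speed, which a single image cannot convey.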

Rewards:
Scalar value (1) for a winning rally
Scalar value (-1) for a losing rally

Episode termination:
Player reaches a score of 21
Episode length > 400000

Solved Requirement:
Average score of 17 over 100 consecutive trials
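
The solved requirement can be checked with a rolling window over episode scores. The sketch below assumes "average score of 17" means a mean of at least 17 over the last 100 episodes; the helper name `make_solved_tracker` is hypothetical.

```python
from collections import deque


def make_solved_tracker(threshold=17.0, window=100):
    """Return a function that records a score and reports whether the task is solved."""
    scores = deque(maxlen=window)   # only the most recent `window` scores are kept

    def record(score):
        scores.append(score)
        # Solved only once a full window exists and its mean reaches the threshold.
        return len(scores) == window and sum(scores) / window >= threshold

    return record
```

Calling `record(episode_score)` after each episode returns `True` the first time the window average reaches the threshold, which is a natural point to save the model.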

Deep Q-Network (DQN)

Train

| Parameter | Value |
| --- | --- |
| Number of episodes | 400 |
| Learning rate | 0.0001 |
| Discount factor | 0.99 |
| Epsilon | 1.0 |
| Batch size | 32 |
| TargetNet update rate (steps) | 1000 |
| Actions | 6 |
| States | (4, 80, 80) |

Test

| Parameter | Value |
| --- | --- |
| Number of episodes | 100 |
| Epsilon | 0.01 |
| Actions | 6 |
| States | (4, 80, 80) |

Network model used for testing: 'saved_networks/dqn_model10' ('tf' model, also available in .h5)
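
For orientation, the two core pieces of a standard DQN can be sketched in NumPy: epsilon-greedy action selection and the bootstrapped training target computed from the target network. This is a generic illustration using the discount factor listed for this setup, not the repository's code; the function names and array layouts are assumptions.

```python
import numpy as np


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


def dqn_targets(rewards, next_q_target, dones, gamma=0.99):
    """Standard DQN target: r + gamma * max_a' Q_target(s', a').

    rewards, dones: shape (batch,); next_q_target: shape (batch, n_actions).
    The bootstrap term is zeroed for terminal transitions (dones == 1).
    """
    max_next = next_q_target.max(axis=1)
    return rewards + gamma * max_next * (1.0 - dones)
```

With epsilon at 1.0 every action is random, and at 0.01 the agent acts greedily almost all the time.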

Dueling DQN

Train

| Parameter | Value |
| --- | --- |
| Number of episodes | 300 |
| Learning rate | 0.0001 |
| Discount factor | 0.99 |
| Epsilon | 1.0 |
| Batch size | 32 |
| TargetNet update rate (steps) | 1000 |
| Actions | 6 |
| States | (4, 80, 80) |

Test

| Parameter | Value |
| --- | --- |
| Number of episodes | 100 |
| Epsilon | 0.01 |
| Actions | 6 |
| States | (4, 80, 80) |

Network model used for testing: 'saved_networks/duelingdqn_model30' ('tf' model, also available in .h5)
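
A dueling network splits its head into a state-value stream and an advantage stream and then recombines them into Q-values. The NumPy sketch below shows the commonly used mean-subtracted aggregation; whether the repository uses the mean or another variant is an assumption, and the function name is hypothetical.

```python
import numpy as np


def dueling_q_values(value, advantages):
    """Combine the two streams: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    value: shape (batch, 1); advantages: shape (batch, n_actions).
    Subtracting the mean advantage keeps V and A identifiable.
    """
    return value + advantages - advantages.mean(axis=1, keepdims=True)
```

In a Keras model the same expression would sit after the two dense streams, as the final step before the Q-value output.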

Dueling Double DQN (D3QN)

Train

| Parameter | Value |
| --- | --- |
| Number of episodes | 400 |
| Learning rate | 0.0001 |
| Discount factor | 0.99 |
| Epsilon | 1.0 |
| Batch size | 32 |
| TargetNet update rate (steps) | 1000 |
| Actions | 6 |
| States | (4, 80, 80) |

Test

| Parameter | Value |
| --- | --- |
| Number of episodes | 100 |
| Epsilon | 0.01 |
| Actions | 6 |
| States | (4, 80, 80) |

Network model used for testing: 'saved_networks/d3qn_model50' ('tf' model, also available in .h5)
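
D3QN pairs the dueling architecture with the Double DQN target, in which the online network chooses the next action and the target network evaluates it. The NumPy sketch below illustrates that generic target computation; it is not taken from the repository, and the function name and array layouts are assumptions.

```python
import numpy as np


def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a)).

    rewards, dones: shape (batch,); both Q arrays: shape (batch, n_actions).
    """
    best_actions = next_q_online.argmax(axis=1)                # selection: online network
    batch_index = np.arange(len(best_actions))
    evaluated = next_q_target[batch_index, best_actions]       # evaluation: target network
    return rewards + gamma * evaluated * (1.0 - dones)
```

Separating selection from evaluation reduces the overestimation that comes from taking a maximum over noisy Q-values from a single network.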
