
mrshininnnnn / temporal-differences-learning


"Learning to Predict by the Methods of Temporal Differences" by Sutton, Richard S. (1988)



Temporal Differences Learning

Introduction

The goal of this project is to reproduce Figures 3, 4, and 5 from Richard Sutton’s 1988 paper "Learning to Predict by the Methods of Temporal Differences".
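For readers new to the paper, the following is a minimal sketch of the TD(lambda) update on the bounded random walk that Sutton (1988) uses for these figures. It is not the code in main.py: the function names, the step size, the number of sequences, and the choice to apply weight changes once per sequence are all assumptions made for illustration.

```python
import random

# Ideal predictions for the five non-terminal states B..F of the bounded
# random walk in Sutton (1988): the probability of ending at the right edge.
TRUE_VALUES = [1 / 6, 2 / 6, 3 / 6, 4 / 6, 5 / 6]


def generate_walk(rng):
    """Return (visited non-terminal state indices, outcome z) for one walk."""
    state = 2  # start in the centre state D (index 2 of B..F)
    states = []
    while 0 <= state <= 4:
        states.append(state)
        state += rng.choice((-1, 1))
    outcome = 1.0 if state > 4 else 0.0  # 1 at the right edge, 0 at the left
    return states, outcome


def td_lambda_deltas(weights, states, outcome, alpha, lam):
    """Accumulate the TD(lambda) weight changes over one sequence.

    With one-hot state vectors, the prediction for a state is its weight and
    the gradient of that prediction is the one-hot vector itself.
    """
    deltas = [0.0] * len(weights)
    trace = [0.0] * len(weights)  # running sum of lam^(t-k) * x_k
    for t, s in enumerate(states):
        trace = [lam * e for e in trace]
        trace[s] += 1.0
        p_now = weights[s]
        # After the final step, the "next prediction" is the observed outcome.
        p_next = weights[states[t + 1]] if t + 1 < len(states) else outcome
        for i, e in enumerate(trace):
            deltas[i] += alpha * (p_next - p_now) * e
    return deltas


def rmse(weights):
    """Root-mean-square error against the ideal predictions."""
    squared = [(w - v) ** 2 for w, v in zip(weights, TRUE_VALUES)]
    return (sum(squared) / len(squared)) ** 0.5


rng = random.Random(0)
weights = [0.5] * 5
for _ in range(2000):  # assumed number of training sequences
    states, outcome = generate_walk(rng)
    deltas = td_lambda_deltas(weights, states, outcome, alpha=0.02, lam=0.3)
    weights = [w + d for w, d in zip(weights, deltas)]
print([round(w, 2) for w in weights], round(rmse(weights), 3))
```

Setting lam=1 makes every earlier state share the final error equally, while lam=0 updates only the most recent state; the figures compare how this choice affects the error.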

Original Figures

[Image: original figures from Sutton (1988)]

Reproduced Figures

[Images: reproduced Figure 3, reproduced Figure 4, reproduced Figure 5]

Directory

  • ./img - stores the generated figures
  • main.py - reproduces the experiments and generates the figures directly
  • main.ipynb - walks through the procedure step by step
Temporal-Differences-Learning/
├── README.md
├── img
├── main.ipynb
├── main.py
├── reference
└── requirements.txt

Dependencies

  • python >= 3.7.2
  • jupyter >= 1.0.0
  • numpy >= 1.16.2
  • matplotlib >= 3.1.1

Setup

Please ensure the following packages are already installed. A virtual environment is recommended.

  • Python (for .py)
  • Jupyter Notebook (for .ipynb)
$ cd Temporal-Differences-Learning/
$ pip3 install pip --upgrade
$ pip3 install -r requirements.txt

Run

To view the notebook:

$ jupyter notebook

To run the script:

$ python3 main.py

To run the script using data with no duplicates:

$ python3 main.py --unique True

To run the script with custom settings:

$ python3 main.py --random_seed <int> --seq_len <int> --batch_num <int> --seq_num <int> --unique <bool>
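As a rough illustration of how flags like these can be parsed, here is a minimal argparse sketch. The flag names are taken from the command above; the default values, the help text, and the str2bool helper are assumptions and may differ from what main.py actually does. The helper is there because argparse's type=bool would treat the string "False" as true.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False'-style strings; bool('False') would be True."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Flag names follow the README; every default below is a placeholder
    # chosen for this sketch, not a value read from main.py.
    parser = argparse.ArgumentParser(description="Reproduce Sutton (1988) figures")
    parser.add_argument("--random_seed", type=int, default=0)
    parser.add_argument("--seq_len", type=int, default=5)
    parser.add_argument("--batch_num", type=int, default=100)
    parser.add_argument("--seq_num", type=int, default=10)
    parser.add_argument("--unique", type=str2bool, default=False,
                        help="use data with no duplicate sequences")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--seq_num", "10", "--unique", "True"])
print(args.seq_num, args.unique)
```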

Output

If everything goes well, you should see output similar to the following.

Start!
Reproduce Figure 3 in Sutton (1988)
Train of Lambda 0
Train of Lambda 0.1
Train of Lambda 0.3
Train of Lambda 0.5
Train of Lambda 0.7
Train of Lambda 0.9
Train of Lambda 1
Saving Figure 3 to img/figure_3.png

Reproduce Figure 4 in Sutton (1988)
Train of Lambda 0.0
Train of Lambda 0.3
Train of Lambda 0.8
Train of Lambda 1.0
Saving Figure 4 to img/figure_4.png

Reproduce Figure 4 in Sutton (1988)
Find Best Alpha for Each Lambda
Train of Lambda 0.0
Train of Lambda 0.1
Train of Lambda 0.2
Train of Lambda 0.3
Train of Lambda 0.4
Train of Lambda 0.5
Train of Lambda 0.6
Train of Lambda 0.7
Train of Lambda 0.8
Train of Lambda 0.9
Train of Lambda 1.0
Best Alpha 0.2 for Lambda 0.0
Best Alpha 0.2 for Lambda 0.1
Best Alpha 0.2 for Lambda 0.2
Best Alpha 0.2 for Lambda 0.3
Best Alpha 0.2 for Lambda 0.4
Best Alpha 0.15 for Lambda 0.5
Best Alpha 0.15 for Lambda 0.6
Best Alpha 0.15 for Lambda 0.7
Best Alpha 0.1 for Lambda 0.8
Best Alpha 0.1 for Lambda 0.9
Best Alpha 0.05 for Lambda 1.0
Re-Train Using Best Alpha for Each Lambda
Train of Lambda 0.0 Alpha 0.2
Train of Lambda 0.1 Alpha 0.2
Train of Lambda 0.2 Alpha 0.2
Train of Lambda 0.3 Alpha 0.2
Train of Lambda 0.4 Alpha 0.2
Train of Lambda 0.5 Alpha 0.15
Train of Lambda 0.6 Alpha 0.15
Train of Lambda 0.7 Alpha 0.15
Train of Lambda 0.8 Alpha 0.1
Train of Lambda 0.9 Alpha 0.1
Train of Lambda 1.0 Alpha 0.05
Saving Figure 5 to img/figure_5.png

Done!

The generated figures are saved under ./img.
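The Figure 5 portion of the log suggests a two-stage procedure: first find the best alpha for each lambda, then re-train with that alpha. A minimal sketch of the selection step is shown below. The function select_best_alphas and the toy_error stand-in are hypothetical; in a real run, the error would come from training and measuring the averaged RMSE, which this sketch does not do.

```python
def select_best_alphas(lambdas, alphas, evaluate):
    """Map each lambda to the alpha with the lowest error under evaluate(lam, alpha)."""
    best = {}
    for lam in lambdas:
        # min() returns the first alpha among ties, in the order given.
        best[lam] = min(alphas, key=lambda alpha: evaluate(lam, alpha))
    return best


def toy_error(lam, alpha):
    # Hypothetical stand-in for the error of a real training run;
    # it simply prefers alpha = 0.2 regardless of lambda.
    return abs(alpha - 0.2)


lambdas = [0.0, 0.5, 1.0]
alphas = [0.1, 0.2, 0.3]
best_alphas = select_best_alphas(lambdas, alphas, toy_error)
for lam, alpha in best_alphas.items():
    print(f"Best Alpha {alpha} for Lambda {lam}")
```

Passing the error function in as an argument keeps the selection logic independent of how training is implemented.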


Reference

  1. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9-44.
