quantumiracle / pytorch-nfsp

This project forked from younggyoseo/pytorch-nfsp

Implementation of Deep Reinforcement Learning from Self-Play in Imperfect-Information Games (Heinrich and Silver, 2016)

pytorch-nfsp's Introduction

Command Instruction

Single Agent against Baseline

For training DQN against the environment baseline:

  1. Single-process version:

    python train_dqn_against_baseline.py --env SlimeVolley-v0 --hidden-dim 256 --max-tag-interval 3000

    python train_dqn_against_baseline.py --env Pong-ram-v0 --hidden-dim 32 --max-tag-interval 10000

    Note:

    • For SlimeVolley, use SlimeVolley-v0 for RAM-based control and SlimeVolleyNoFrameskip-v0 for image-based control (3*168*84 observations). A hidden-dim of 256 is required to learn effective models. The maximum episode length is about 3000, so --max-tag-interval must be at least 3000.
    • For Pong (OpenAI Gym Atari Pong), use Pong-ram-v0 for RAM-based control and Pong-v0 for image-based control. The default hidden-dim of 32 can solve the RAM version within an hour. The episode length is usually under 10000, so --max-tag-interval 10000 is sufficient.
  2. Multi-process version (with vectorized environments):

    python train_dqn_against_baseline_mp.py --env SlimeVolley-v0 --hidden-dim 256 --num-envs 5 --max-tag-interval 3000

    python train_dqn_against_baseline_mp.py --env Pong-ram-v0 --hidden-dim 256 --num-envs 2 --max-tag-interval 10000
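The flags used in the commands above can be read as ordinary command-line options. The following is a minimal sketch, assuming the scripts use Python's argparse; `build_parser` and every default except the hidden-dim of 32 are hypothetical and are not taken from the repository. It shows how a hyphenated flag such as --num-envs becomes the attribute `num_envs` after parsing.

```python
import argparse


def build_parser():
    # Hypothetical parser that mirrors the flags in the commands above;
    # it is NOT the repository's actual argument definition.
    parser = argparse.ArgumentParser(description="DQN against baseline (sketch)")
    parser.add_argument("--env", type=str, default="SlimeVolley-v0")
    parser.add_argument("--hidden-dim", type=int, default=32)
    parser.add_argument("--num-envs", type=int, default=1)  # assumed default
    parser.add_argument("--max-tag-interval", type=int, default=10000)  # assumed default
    return parser


# Parse the same arguments as the first multi-process command.
args = build_parser().parse_args(
    ["--env", "SlimeVolley-v0", "--hidden-dim", "256",
     "--num-envs", "5", "--max-tag-interval", "3000"]
)

# argparse converts hyphens in long option names to underscores.
print(args.env, args.hidden_dim, args.num_envs, args.max_tag_interval)
```

Note that argparse matches option strings exactly (or by unique prefix), so a parser defined with --num-envs would not accept --num_envs on the command line.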

Two Agents Nash DQN

For two-agent zero-sum games with Nash DQN:

Test with rps_v1 (gamma is set to 0 because it is a repeated stage game):

python nash_dqn.py --env rps_v1 --num-envs 2 --hidden-dim 64 --evaluation-interval 500 --rl-start 1000 --lr 0.0001 --gamma 0
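To illustrate why gamma 0 suits a repeated stage game, here is a minimal sketch of a one-step bootstrapped target of the standard form r + gamma * V(s'). The function `td_target` and its arguments are illustrative names, not code from this repository. With gamma set to 0, the target reduces to the immediate reward, so the value of the next stage is ignored.

```python
def td_target(reward, next_state_value, gamma, done):
    """One-step target: r + gamma * V(s') for non-terminal steps, r otherwise."""
    if done:
        return reward
    return reward + gamma * next_state_value


# With gamma = 0 the next-state value has no influence on the target.
print(td_target(1.0, 5.0, 0.0, False))  # 1.0
# With gamma > 0 the next-state value is bootstrapped into the target.
print(td_target(1.0, 5.0, 0.5, False))  # 3.5
```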

python nash_dqn.py --env SlimeVolley-v0 --hidden-dim 256 --num-envs 5 --max-tag-interval 3000

python nash_dqn.py --env pong_v1 --ram --hidden-dim 32 --num-envs 2 --max-tag-interval 10000

Note:

  • pong_v1 is the two-agent Pong game from PettingZoo. Specify --ram for RAM-based control; without it, control is image-based.

Two Agents Neural Fictitious Self-Play (NFSP)

python main.py --env SlimeVolley-v0 --hidden-dim 256 --max-frames 20000000 --max-tag-interval 3000

python main.py --env SlimeVolleyNoFrameskip-v0 --hidden-dim 512 --max-frames 30000000 --max-tag-interval 3000

python main.py --env pong_v1 --ram --max-frames 20000000 --max-tag-interval 10000
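For orientation, the core idea of NFSP in Heinrich and Silver (2016) is that each agent mixes two policies: a best-response policy derived from Q-values and an average policy that imitates its own past best responses. The following is a minimal sketch of that action selection, not the implementation in main.py; `nfsp_select_action`, `eta` (the mixing probability) and the other argument names are illustrative assumptions.

```python
import random


def nfsp_select_action(q_values, avg_policy, eta, epsilon, rng):
    """Mix a best-response policy with an average policy.

    With probability eta, act epsilon-greedily on q_values (best response);
    otherwise sample an action from avg_policy (a list of probabilities).
    Returns (action, used_best_response).
    """
    n = len(q_values)
    if rng.random() < eta:
        if rng.random() < epsilon:
            # Exploration step of the epsilon-greedy best response.
            return rng.randrange(n), True
        # Greedy step: pick the action with the highest Q-value.
        return max(range(n), key=lambda a: q_values[a]), True
    # Average-policy step: sample according to the given probabilities.
    action = rng.choices(range(n), weights=avg_policy, k=1)[0]
    return action, False


rng = random.Random(0)
# eta = 1 always uses the best response; epsilon = 0 makes it greedy.
print(nfsp_select_action([0.1, 0.9, 0.3], [1.0, 0.0, 0.0], 1.0, 0.0, rng))  # (1, True)
# eta = 0 always samples from the average policy.
print(nfsp_select_action([0.1, 0.9, 0.3], [1.0, 0.0, 0.0], 0.0, 0.0, rng))  # (0, False)
```

The returned flag matters because, in the paper, only actions taken under the best-response policy are stored as supervised targets for training the average policy.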
