chess-alpha-zero

About

Chess reinforcement learning by AlphaZero methods.

This project is based on the following resources:

  1. DeepMind's October 19th, 2017 publication: Mastering the Game of Go without Human Knowledge
  2. DeepMind's arXiv paper Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
  3. @mokemokechicken's excellent Reversi implementation of the DeepMind ideas: https://github.com/mokemokechicken/reversi-alpha-zero

Note: This project is still under construction!!

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaZero implementation consists of two workers, self and opt.

  • self plays the newest model against itself to generate self-play data for use in training.
  • opt trains the existing model to create further new models, using the most recent self-play data.
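
Together these two workers form a loop: self continually produces fresh game records, and opt continually consumes them to produce improved models. A rough conceptual sketch of that loop follows (the function names are illustrative placeholders, not this repository's actual API):

# Conceptual sketch only; play_self_games, save_play_data, train_on,
# load_recent_play_data and save_model are illustrative placeholders.
while True:
    # "self": the newest model plays itself; games go to data/play_data/
    games = play_self_games(newest_model)
    save_play_data(games)

    # "opt": train on the most recent self-play data and save a new model
    newest_model = train_on(newest_model, load_recent_play_data())
    save_model(newest_model)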

Evaluation

Evaluation options are provided by eval and gui.

  • eval automatically tests the newest model by playing it against an older model (whose age can be specified).
  • gui allows you to personally play against the newest model.

Data

  • data/model/model_*: newest model.
  • data/model/old_models/*: archived old models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.

If you want to train a model from scratch, delete the above directories.
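
For example, a minimal sketch for clearing the generated data (paths assume the default layout listed above):

# Hedged sketch: remove generated models, self-play data, and logs so that
# the next run trains from scratch.
import shutil
for path in ("data/model", "data/play_data", "logs"):
    shutil.rmtree(path, ignore_errors=True)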

How to use

Setup

install libraries

pip install -r requirements.txt

If you want to use a GPU:

pip install tensorflow-gpu

set environment variables

Create a .env file and write the following in it:

KERAS_BACKEND=tensorflow

Basic Usage

To train a model or further train an existing model, execute Self-Play and Trainer.

Self-Play

python src/chess_zero/run.py self

When executed, self-play will start using BestModel. If BestModel does not exist, a new random model will be created and become BestModel.

options

  • --new: create a new model from scratch as the newest model
  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)
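
For example, to start self-play with a freshly created model using the mini config:

python src/chess_zero/run.py self --new --type mini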

Trainer

python src/chess_zero/run.py opt

When executed, training will start. The base model is loaded from the latest saved next-generation model; if none exists, BestModel is used. The trained model is saved every 2000 steps (mini-batches).

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)
  • --total-step: specify a nonzero starting point for the total step count (mini-batches)
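
For example, to train with the mini config while starting the step counter at an (illustrative) value of 100000:

python src/chess_zero/run.py opt --type mini --total-step 100000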

Evaluator

python src/chess_zero/run.py eval

When executed, evaluation will start. It evaluates BestModel against the latest next-generation model by playing about 200 games. If the next-generation model wins, it becomes the new BestModel.

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)

Play Game

python src/chess_zero/run.py gui

When executed, an ordinary chess board will be displayed in Unicode, and you can play against the newest model.

options

  • --type mini: use the mini config for testing (see src/chess_zero/configs/mini.py)
  • --type small: use the small config for commodity hardware (see src/chess_zero/configs/small.py)

Tips and Memos

GPU Memory

Insufficient GPU memory usually causes warnings, not errors. If an error occurs, try changing per_process_gpu_memory_fraction in src/worker/{evaluate.py,optimize.py,self_play.py}:

tf_util.set_session_config(per_process_gpu_memory_fraction=0.2)

A smaller batch_size will reduce the memory usage of opt. Try changing TrainerConfig#batch_size in NormalConfig.
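
A hedged sketch of that change (the exact layout of the config classes is an assumption; check src/chess_zero/configs/ for the real definitions):

# Hypothetical sketch; the real attribute layout may differ. The idea is
# simply to lower TrainerConfig.batch_size in the config you are using.
class TrainerConfig:
    batch_size = 256  # try 128 if opt still runs out of GPU memory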

Tablebases

This implementation supports using the Gaviota tablebases for endgame evaluation. The tablebase files should be placed into the directory chess-alpha-zero/tablebases. The Gaviota bases can be generated from scratch (see the repository), or downloaded directly via torrent (see "Gaviota" on the Olympus Tracker).
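
Using the python-chess library (an assumption about this project's dependencies), probing the Gaviota bases looks roughly like this (a hedged sketch; the example position and the relative tablebases path are assumptions):

import chess
import chess.gaviota

# Hedged sketch: probe the Gaviota tablebases with python-chess, assuming the
# tablebase files live in chess-alpha-zero/tablebases and the needed
# piece-count tables are present.
with chess.gaviota.open_tablebase("tablebases") as tablebase:
    board = chess.Board("8/8/8/8/8/4k3/4P3/4K3 w - - 0 1")  # simple KP vs K endgame
    dtm = tablebase.probe_dtm(board)  # distance to mate; sign indicates the winning side
    print(dtm)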

Contributors

  • benediamond
  • samuelstarshot
  • yhyu13
