SC2Learner (TStarBot1) - Macro-Action-Based StarCraft-II Reinforcement Learning Environment

SC2Learner is a macro-action-based StarCraft-II reinforcement learning research platform. It exposes a redesigned StarCraft-II action space of more than one hundred discrete macro actions, built on the raw APIs exposed by DeepMind and Blizzard's PySC2. The macro action space relieves learning algorithms of the burden of directly handling a massive number of atomic keyboard and mouse operations, making learning more tractable. The environments and wrappers strictly follow the OpenAI Gym interface, making them easy to adapt to many off-the-shelf reinforcement learning algorithms and implementations.

TStarBot1, a reinforcement learning agent, is also released, together with two off-the-shelf reinforcement learning algorithms, Dueling Double Deep Q Network (DDQN) and Proximal Policy Optimization (PPO), as examples. Distributed versions of both algorithms are included, enabling learners to scale up rollout experience collection across thousands of CPU cores on a cluster of machines. TStarBot1 is able to beat the level-9 built-in AI (cheating resources) with a 97% win-rate and the level-10 built-in AI (cheating insane) with an 81% win-rate.

A whitepaper on TStarBots is available here.

Installations

Prerequisites

Setup

Clone this repository, then install it with:

pip3 install -e sc2learner

Getting Started

Run Random Agent

Run a random agent against a built-in AI of difficulty level 1.

python3 -m sc2learner.bin.evaluate --agent random --difficulty '1'

Train PPO Agent

To train an agent with the PPO algorithm, the actor workers and the learner worker must be started separately. They can run either locally or across separate machines (e.g. actors usually run on a CPU cluster consisting of hundreds of machines with tens of thousands of CPU cores, while the learner runs on a GPU machine). Actors and the learner exchange rollout trajectories and model parameters over the designated ports, using the learner's IP address.

  • Start 48 actor workers (run the same script on all actor machines)
for i in $(seq 0 47); do
  CUDA_VISIBLE_DEVICES= python3 -m sc2learner.bin.train_ppo --job_name=actor --learner_ip localhost &
done;
  • Start a learner worker
CUDA_VISIBLE_DEVICES=0 python3 -m sc2learner.bin.train_ppo --job_name learner
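
The actor/learner split described above can be illustrated with a minimal single-process sketch. This is not sc2learner's implementation: the real workers exchange data over network ports, whereas this toy uses threads and an in-process queue. The names `actor`, `learner`, and `run`, and the integer "parameter version", are hypothetical stand-ins.

```python
import queue
import threading


def actor(actor_id, rollout_queue, params, num_rollouts):
    """Hypothetical actor: reads the latest parameters and emits rollouts."""
    for step in range(num_rollouts):
        version = params["version"]  # stand-in for pulling model parameters
        trajectory = {"actor": actor_id, "step": step, "version": version}
        rollout_queue.put(trajectory)  # stand-in for sending to the learner


def learner(rollout_queue, params, total_rollouts):
    """Hypothetical learner: consumes rollouts and publishes new parameters."""
    consumed = 0
    while consumed < total_rollouts:
        rollout_queue.get()  # blocks until some actor delivers a rollout
        params["version"] += 1  # stand-in for one gradient update
        consumed += 1
    return consumed


def run(num_actors=4, rollouts_per_actor=5):
    """Start the actors as threads, then run the learner in this thread."""
    rollout_queue = queue.Queue()
    params = {"version": 0}  # only the learner writes; actors only read
    threads = [
        threading.Thread(
            target=actor, args=(i, rollout_queue, params, rollouts_per_actor)
        )
        for i in range(num_actors)
    ]
    for thread in threads:
        thread.start()
    consumed = learner(rollout_queue, params, num_actors * rollouts_per_actor)
    for thread in threads:
        thread.join()
    return consumed, params["version"]
```

Because actors only produce and the learner only consumes, more actors can be added without changing the learner, which is the property that lets rollout collection scale out across machines.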

Similarly, the DQN algorithm can be tried with sc2learner.bin.train_dqn.

Evaluate PPO Agent

After training, the agent's in-game performance can be observed by letting it play against a built-in AI of a chosen difficulty level. The win-rate is estimated at the same time, from multiple such games initialized with different game seeds.

python3 -m sc2learner.bin.evaluate --agent ppo --difficulty 1 --model_path REPLACE_WITH_YOUR_OWN_MODEL_PATH
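
As a rough illustration of estimating a win-rate from games played with different seeds, the sketch below averages win/loss outcomes and reports a binomial standard error. It does not call sc2learner: `play_game` is a hypothetical stand-in that simulates an outcome from the seed, and the `win_probability` value is an arbitrary assumption.

```python
import math
import random


def play_game(seed, win_probability=0.8):
    """Hypothetical stand-in for one evaluation game; returns True on a win."""
    return random.Random(seed).random() < win_probability


def estimate_win_rate(seeds, play=play_game):
    """Return (win_rate, standard_error) over one game per seed."""
    outcomes = [bool(play(seed)) for seed in seeds]
    num_games = len(outcomes)
    if num_games == 0:
        raise ValueError("at least one seed is required")
    win_rate = sum(outcomes) / num_games
    # Standard error of a binomial proportion.
    std_err = math.sqrt(win_rate * (1.0 - win_rate) / num_games)
    return win_rate, std_err


if __name__ == "__main__":
    rate, err = estimate_win_rate(range(100))
    print(f"win-rate: {rate:.2f} +/- {err:.2f}")
```

The standard error shrinks with the square root of the number of games, so a win-rate estimated from only a handful of games should be read with caution.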

Play vs. PPO Agent

You can also play against the learned agent yourself by first starting a human player client and then the learned agent. The two can run either locally or remotely. When they run on two machines, the --remote argument must be set on the human player side to create an SSH tunnel to the remote agent's machine, and SSH keys must be used for authentication.

  • Start a human player client
CUDA_VISIBLE_DEVICES= python3 -m pysc2.bin.play_vs_agent --human --map AbyssalReef --user_race zerg
  • Start a PPO agent
python3 -m sc2learner.bin.play_vs_ppo_agent --model_path REPLACE_WITH_YOUR_OWN_MODEL_PATH

Train via Self-play

Self-play training (playing against past versions of the agent) is also provided, which makes it possible to learn more diversified strategies.

  • Start Actors
for i in $(seq 0 48); do
  CUDA_VISIBLE_DEVICES= python3 -m sc2learner.bin.train_ppo_selfplay --job_name=actor --learner_ip localhost &
done;
  • Start Learner
CUDA_VISIBLE_DEVICES=0 python3 -m sc2learner.bin.train_ppo_selfplay --job_name learner
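
One common way to organise "playing against past versions" is to keep a bounded pool of policy snapshots and sample an opponent from it for each game. The sketch below shows that idea only; the source does not say how sc2learner selects opponents, so the `OpponentPool` class, the uniform sampling, and the pool size are all assumptions.

```python
import random


class OpponentPool:
    """Hypothetical bounded pool of past policy snapshots."""

    def __init__(self, max_size=10, seed=None):
        self._snapshots = []
        self._max_size = max_size
        self._rng = random.Random(seed)

    def add(self, version, params):
        """Store a snapshot, dropping the oldest one when the pool is full."""
        self._snapshots.append((version, params))
        if len(self._snapshots) > self._max_size:
            self._snapshots.pop(0)

    def sample(self):
        """Pick a past snapshot uniformly at random (an assumed strategy)."""
        if not self._snapshots:
            raise ValueError("opponent pool is empty")
        return self._rng.choice(self._snapshots)

    def __len__(self):
        return len(self._snapshots)


if __name__ == "__main__":
    pool = OpponentPool(max_size=3, seed=0)
    for version in range(5):
        pool.add(version, params={"weights": version})
    opponent_version, _ = pool.sample()
    print("playing against snapshot", opponent_version)
```

Sampling from several past snapshots, rather than always facing the latest one, exposes the learner to a wider range of opponent behaviour.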

Environments and Wrappers

The environments and wrappers strictly follow the OpenAI Gym interface. The macro action space is defined in ZergActionWrapper and the observation space is defined in ZergObservationWrapper. Users can modify either wrapper and restart training to see the effect.
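
To show what a macro-action wrapper does in general, here is a minimal sketch in the classic Gym style, where `step` returns `(observation, reward, done, info)`. It does not use Gym, PySC2, or the real ZergActionWrapper: `ToyEnv` and `MacroActionWrapper` are hypothetical, and mapping each macro to a fixed sequence of atomic actions is an assumption made for illustration.

```python
class ToyEnv:
    """Hypothetical base environment whose atomic actions are integers."""

    def reset(self):
        self.total = 0
        return self.total

    def step(self, action):
        self.total += action
        done = self.total >= 10
        reward = 1.0 if done else 0.0
        return self.total, reward, done, {}


class MacroActionWrapper:
    """Exposes a discrete macro action space on top of atomic actions."""

    def __init__(self, env, macros):
        self.env = env
        self.macros = macros  # macros[i] is a sequence of atomic actions

    def reset(self):
        return self.env.reset()

    def step(self, macro_action):
        """Run every atomic action of the chosen macro and sum the rewards."""
        total_reward = 0.0
        obs, done, info = None, False, {}
        for atomic_action in self.macros[macro_action]:
            obs, reward, done, info = self.env.step(atomic_action)
            total_reward += reward
            if done:  # stop early if the episode ends mid-macro
                break
        return obs, total_reward, done, info


if __name__ == "__main__":
    env = MacroActionWrapper(ToyEnv(), macros=[(1, 1), (2, 3, 5)])
    env.reset()
    print(env.step(0))
    print(env.step(1))
```

The learning algorithm only ever sees the small set of macro indices, which is the sense in which a macro action space makes the problem more tractable.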

Questions and Help

You are welcome to submit questions and bug reports in GitHub Issues. You are also welcome to contribute to this project.

tstarbot1's Issues

Pretrained model

Is it possible for you guys to provide a pretrained model for TStarBot1?

Thank you.

Platform question

Hi xinghai-sun, thank you for sharing this amazing project. Could you please tell us the platform for the project as well, such as the OS (Windows, Ubuntu, or macOS) and the game version (is it based on 4.8.2 or another SC2 version)?

Nothing runs in learner mode

Hello, I have a problem: when I run learner mode in cmd, the game starts but does not run at all. It gets stuck and the time stays at 0:00. What is wrong? Any help is appreciated, thank you.

Torch install error

After installing the PySC2 extension, I cloned TStarBot1 and installed it, and got the following error:
Could not find a version that satisfies the requirement torch==0.4.1 (from versions: 0.1.2, 0.1.2.post1)
No matching distribution found for torch==0.4.1

I searched for this error online, and some people say it may be caused by the operating system. Is that right? My operating system is Windows 10. Does the project not support Windows?