pysc2-rl-agent's Introduction

(D)RL Agent For PySC2 Environment

MoveToBeacon, CollectMineralShards, DefeatRoaches, DefeatZerglingsAndBanelings, FindAndDefeatZerglings, CollectMineralsAndGas, BuildMarines

Introduction

The aim of this project is two-fold:

a.) Reproduce the baseline DeepMind results by implementing an RL agent (A2C) with a neural network architecture as close as possible to the one described in [1]. This includes embedding categorical (spatial) features into a continuous space with a 1x1 convolution, and a multi-head policy that supports actions with variable arguments (both spatial and non-spatial).
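
The categorical embedding mentioned above can be illustrated with a minimal sketch. It assumes a feature layer stored as a 2D grid of integer category ids. One-hot encoding each pixel and applying a 1x1 convolution then amounts to looking up one row of the kernel per pixel. The function names, the toy grid and the weights are hypothetical and are not taken from this project's code; plain Python lists stand in for tensors.

```python
def one_hot(index, num_categories):
    """Return a one-hot list of length num_categories."""
    return [1.0 if i == index else 0.0 for i in range(num_categories)]


def conv1x1(grid, kernel):
    """Apply a 1x1 convolution to an H x W x C_in grid.

    kernel has shape C_in x C_out; every pixel is transformed independently.
    """
    c_out = len(kernel[0])
    return [
        [
            [sum(pixel[c] * kernel[c][k] for c in range(len(pixel)))
             for k in range(c_out)]
            for pixel in row
        ]
        for row in grid
    ]


def embed_categorical(layer, kernel):
    """One-hot encode an H x W layer of category ids, then apply conv1x1."""
    num_categories = len(kernel)
    one_hot_grid = [[one_hot(v, num_categories) for v in row] for row in layer]
    return conv1x1(one_hot_grid, kernel)


# Toy example (hypothetical values): 3 categories, 2 continuous channels.
layer = [[0, 2], [1, 1]]
kernel = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
embedded = embed_categorical(layer, kernel)
```

Each output pixel equals the kernel row selected by its category id, which is why this construction behaves like a learned embedding table applied at every spatial position.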

b.) Improve the results and/or sample efficiency of the baseline solution, either with alternative algorithms (such as PPO [2]), with a reduced set of features (unified across all mini-games), or with alternative approaches such as HRL [3] or Auxiliary Tasks [4].

Results

Map                            This Agent   DeepMind
MoveToBeacon                   26.3         26
CollectMineralShards           102          103
FindAndDefeatZerglings         43           45
DefeatRoaches                  126*         100
DefeatZerglingsAndBanelings    197*         62
CollectMineralsAndGas          3340         3978
BuildMarines                   0.55         3

* Unstable result with a high standard deviation (40 for DefeatRoaches and 120 for DefeatZerglingsAndBanelings).

A video of the trained agent on all minigames can be seen here: https://youtu.be/QdeObwCCxFI

Running

  • To train an agent, execute python main.py --envs=1 --map=MoveToBeacon
  • To resume training from the last checkpoint, specify the --restore flag
  • To run in inference mode, specify the --test flag
  • To change the number of rendered environments, specify the --render= flag
  • To change the state/action space, specify the path to a JSON config with --cfg_path=. The configuration with the reduced feature space used to achieve some of the results above is:
{
  "feats": {
    "screen": ["visibility_map", "player_relative", "unit_type", "selected", "unit_hit_points_ratio", "unit_density"],
    "minimap": ["visibility_map", "camera", "player_relative", "selected"],
    "non_spatial": ["player", "available_actions"]
  }
}
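
Before passing a custom config to --cfg_path=, it can help to check that the file parses and contains the expected feature groups. The sketch below is a standalone check, not part of this project: the helper name parse_config is hypothetical, and the requirement that all three groups be present is an assumption based only on the example above.

```python
import json

# Feature groups that appear in the example config above.
EXPECTED_GROUPS = ("screen", "minimap", "non_spatial")


def parse_config(text):
    """Parse a config string and return its "feats" mapping after basic checks.

    Assumption: every group in EXPECTED_GROUPS must be a list of names.
    """
    feats = json.loads(text)["feats"]
    for group in EXPECTED_GROUPS:
        if not isinstance(feats.get(group), list):
            raise ValueError(f"missing or invalid feature group: {group}")
    return feats


config_text = """
{
  "feats": {
    "screen": ["visibility_map", "player_relative", "unit_type", "selected",
               "unit_hit_points_ratio", "unit_density"],
    "minimap": ["visibility_map", "camera", "player_relative", "selected"],
    "non_spatial": ["player", "available_actions"]
  }
}
"""

feats = parse_config(config_text)
# Count how many features each group contributes.
summary = {group: len(names) for group, names in feats.items()}
print(summary)
```

To check a file on disk, read its contents and pass the resulting string to parse_config.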

Requirements

A good GPU and CPU are recommended, especially for the full state/action space.

Related Work

The authors of xhujoy/pysc2-agents and pekaalto/sc2aibot were the first to attempt replicating [1], and their implementations served as general inspiration during the development of this project. However, their aim was to replicate results rather than architecture, so they miss key aspects such as full feature and action space support. The authors of simonmeister/pysc2-rl-agents also aim to replicate both results and architecture, though their final goals seem to lie in another direction. Their policy implementation was used as a loose reference for this project.

References

[1] StarCraft II: A New Challenge for Reinforcement Learning
[2] Proximal Policy Optimization Algorithms
[3] Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
[4] Reinforcement Learning with Unsupervised Auxiliary Tasks