lijunsun90 / matrixworld

MatrixWorld: A pursuit-evasion platform for safe multi-agent coordination and autocurricula

Home Page: https://arxiv.org/pdf/2307.14854.pdf

Python 100.00%
adversarial-learning arms-race co-evolution grid-world multi-agent-environment multi-agent-path-finding pursuit-evasion safety-critical autocurricula collision-resolution safe-multiagent-reinforcement-learning large-scale

matrixworld's Introduction

Official code for the paper "MatrixWorld: A pursuit-evasion platform for safe multi-agent coordination and autocurricula", which is available as a preprint on arXiv and is currently under review.

More documentation will be added over time.

Description

MatrixWorld is

  • a safety-constrained pursuit-evasion platform for safe multi-agent coordination,
  • a lightweight co-evolution environment for autocurricula research.

In this work,

  • safety is defined in terms of multi-agent collision avoidance, which covers diverse safety definitions used in real-world applications.
  • 9 pursuit-evasion game variants are defined for example scenarios such as real-world drone and vehicle swarms, multi-agent path finding (MAPF), popular pursuit-evasion setups, and the classic cops-and-robbers problem.

It can be used for the research of

  • safe multi-agent environment implementation,
  • safe multi-agent reinforcement learning (MARL),
  • safe multi-agent coordination,
  • co-evolution, autocurricula, self-play, arms races, or adversarial learning.

Task definition

  • Nine pursuit-evasion variants are defined for example scenarios such as (1) real-world drone and vehicle swarms, (2) multi-agent path finding (MAPF), (3) popular pursuit-evasion setups, and (4) the classic cops-and-robbers problem.
  • More pursuit-evasion variants (other tasks) can be designed based on different practical meanings of safety.

Environmental parameters

  • Origin of the grid world: Top left corner.
  • Size of the grid world: Tunable. Default value: 20 x 20.
  • Swarm size of agents (pursuers and evaders): Tunable. Default value: n_evaders=3, n_pursuers=12.
  • Observation: Tunable. Binary matrix of size fov_scope x fov_scope x 3, i.e., a square field of view centered at the agent with 3 channels: local_evader, local_pursuers, local_obstacles. Default value: 11 x 11 x 3.
  • Action: Vector of size 5, where indices 0 to 4 represent keeping still, moving north, moving east, moving south, and moving west, respectively.
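
The action encoding above can be sketched as follows. This is a minimal illustration, not the repository's API: the names `ACTION_DELTAS` and `apply_action` are hypothetical, positions are assumed to be stored as (row, col) with the origin at the top-left corner (so moving north decreases the row index), and an agent that would leave the grid is assumed to stay in place.

```python
# Hypothetical sketch: names and the (row, col) convention are assumptions,
# not taken from the MatrixWorld code base.
ACTION_DELTAS = {
    0: (0, 0),   # keep still
    1: (-1, 0),  # move north (row index decreases, origin is top left)
    2: (0, 1),   # move east
    3: (1, 0),   # move south
    4: (0, -1),  # move west
}


def apply_action(position, action, world_rows=20, world_cols=20):
    """Return the cell reached by taking `action` from `position`.

    Assumption: a move that would leave the grid keeps the agent in place.
    """
    row, col = position
    d_row, d_col = ACTION_DELTAS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < world_rows and 0 <= new_col < world_cols:
        return (new_row, new_col)
    return (row, col)


if __name__ == "__main__":
    print(apply_action((5, 5), 2))  # one step east of (5, 5)
    print(apply_action((0, 0), 1))  # north of the top row: agent stays put
```

This covers a single agent only; simultaneous moves of several agents additionally need the collision resolution mechanism described later.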

Remark: The code provides utility functions for matrix-based and vector-based global and local observations.
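
As a rough illustration of a matrix-based local observation, the sketch below builds a fov_scope x fov_scope x 3 binary observation centered on one agent. The function name `local_observation` and its arguments are hypothetical and do not correspond to the repository's utility functions. The sketch also assumes that cells outside the field of view are simply dropped and that the observing agent appears in its own channel if it is included in the input list.

```python
# Hypothetical sketch: function name, arguments, and edge handling are
# assumptions, not the repository's utility functions.
def local_observation(agent_pos, evaders, pursuers, obstacles, fov_scope=11):
    """Build a fov_scope x fov_scope x 3 binary observation as nested lists.

    Channel order follows the README: evaders, pursuers, obstacles.
    All positions are (row, col) tuples in global grid coordinates.
    """
    half = fov_scope // 2
    row0, col0 = agent_pos
    obs = [[[0, 0, 0] for _ in range(fov_scope)] for _ in range(fov_scope)]
    for channel, cells in enumerate((evaders, pursuers, obstacles)):
        for row, col in cells:
            # Shift global coordinates so that the agent sits at the center.
            local_row = row - row0 + half
            local_col = col - col0 + half
            if 0 <= local_row < fov_scope and 0 <= local_col < fov_scope:
                obs[local_row][local_col][channel] = 1
    return obs


if __name__ == "__main__":
    obs = local_observation(
        agent_pos=(10, 10),
        evaders=[(8, 10)],
        pursuers=[(10, 10), (10, 12)],
        obstacles=[(0, 0)],  # too far away to appear in the 11 x 11 window
    )
    print(len(obs), len(obs[0]), len(obs[0][0]))
```

Nested lists keep the sketch dependency-free; an array library would be the more natural container for real training code.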

Safety-constrained multi-agent action execution model

The proposed safety-constrained multi-agent action execution model is a general model for implementing safe multi-agent environments in software.

It consists of two parts: (1) a multi-agent-environment interaction model; (2) a safety-constrained collision resolution mechanism for the simultaneous action execution of multiple agents.

(1) Multi-agent-environment interaction model

The multi-agent-environment interaction model targets adversarial multi-agent settings, e.g., pursuit-evasion games.

(2) Safety-constrained collision resolution mechanism

The collision resolution mechanism is defined for the simultaneous action execution of agents. It consists of 3 collision types and 3 collision outcomes for each type, based on safety definitions in real-world applications and conventions in the literature.

Remark: The collision resolution mechanism also determines which agent is responsible for a collision, which helps algorithms learn correctly.
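
The specific collision types and outcomes are defined in the paper and are not reproduced in this text. The sketch below only illustrates the general idea of resolving simultaneous moves and recording responsibility. Everything in it is an assumption made for illustration: the function name `resolve_moves`, the three conflict kinds it checks (moving into an obstacle, two agents swapping cells, several agents ending in the same cell), the single outcome it applies (the responsible agent stays in its original cell), and the rule that only agents that tried to move are held responsible.

```python
# Hypothetical sketch: conflict kinds, outcome, and blame rule are assumptions
# chosen for illustration, not the mechanism defined in the paper.
def resolve_moves(positions, targets, obstacles):
    """Resolve simultaneous moves on a grid.

    positions, targets: dict mapping agent id -> (row, col); start cells
    are assumed to be distinct. obstacles: set of (row, col) cells.
    Returns (final_positions, responsible_agents).
    """
    final = dict(targets)
    responsible = set()

    # Conflict kind 1: an agent tries to enter an obstacle cell.
    for agent, cell in targets.items():
        if cell in obstacles:
            final[agent] = positions[agent]
            responsible.add(agent)

    # Conflict kind 2: two agents try to swap cells.
    occupant_of = {cell: agent for agent, cell in positions.items()}
    for agent, cell in targets.items():
        other = occupant_of.get(cell)
        if other is not None and other != agent:
            if targets[other] == positions[agent]:
                final[agent] = positions[agent]
                responsible.add(agent)

    # Conflict kind 3: several agents end in the same cell. Reverting an
    # agent can create a new conflict, so repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        occupants = {}
        for agent, cell in final.items():
            occupants.setdefault(cell, []).append(agent)
        for agents in occupants.values():
            if len(agents) < 2:
                continue
            for agent in agents:
                if final[agent] != positions[agent]:
                    final[agent] = positions[agent]
                    responsible.add(agent)
                    changed = True
    return final, responsible


if __name__ == "__main__":
    start = {"p1": (0, 0), "p2": (0, 2), "e1": (1, 1), "p3": (3, 3), "p4": (3, 4)}
    wanted = {"p1": (0, 1), "p2": (0, 1), "e1": (1, 1), "p3": (3, 4), "p4": (3, 3)}
    final, responsible = resolve_moves(start, wanted, obstacles={(5, 5)})
    print(final)
    print(sorted(responsible))
```

The returned set of responsible agents is what a learning algorithm could use to assign a collision penalty only to the agents that caused the conflict.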

Lightweight co-evolution platform

  • MatrixWorld is a lightweight co-evolution platform for testing autocurricula research ideas.
  • Our experiments realize autocurricula between pursuers and evaders through adversarial learning.
  • Our experiments show that passive (evasive) policy learning benefits more from co-evolution than active (pursuing) policy learning in an asymmetric adversarial game.

Figure: (left) evasive behavior trained by normal reinforcement learning; (middle) evasive behavior trained by adversarial learning; (right) arms race in the learning process of pursuers and evaders.

Paper citation

Please cite the following paper if you use this environment or code, or find it useful.

@article{sun2023matrixworld,
  title={MatrixWorld: A pursuit-evasion platform for safe multi-agent coordination and autocurricula},
  author={Sun, Lijun and Chang, Yu-Cheng and Lyu, Chao and Lin, Chin-Teng and Shi, Yuhui},
  journal={arXiv preprint arXiv:2307.14854},
  year={2023}
}
