marl-comm's Introduction

Multi-agent Communication Algs

Built on Tianshou and PettingZoo.

Tutorials

Example Scripts

On the Pistonball environment

  • test/test_mappo.py
  • test/test_qmix.py

Algs

Comm-free baselines

  • IDDPG
  • MADDPG
  • ISAC
  • MASAC
  • PPO
  • MAPPO
  • IQL
  • QMIX
  • QMIX-Attention

Comm baselines

Simple Taxonomy

flowchart TD
    subgraph CTDE [CTDE]
        subgraph p [Parameter]
            IP([Individual Parameter])
            PS([Parameter Sharing])
            IPGI([Individual Parameter with Global Information])
        end
        subgraph c [Critic]
            IC([Individual Critic])
            JC([Joint Critic])
        end
    end
    subgraph FD [Fully Decentralized]
    end

DataFlow in MARL-Comm

CL or DL

CTDE

We recommend A Survey of Multi-Agent Reinforcement Learning with Communication for a detailed taxonomy.

Training Schemes

Types                 Sub-types
Fully Decentralized
CTDE                  Individual Parameter
                      Parameter Sharing
                      Individual Parameter with Global Info

Logic of Tianshou in MARL

Tianshou API Overview

The figure comes from https://colab.research.google.com/drive/1MhzYXtUEfnRrlAVSB3SR83r0HA5wds2i?usp=sharing.

MARL Design Overview

flowchart
    subgraph mapolicy [MA-policy]
        p1((Agent1)) --action--> m([Manager]) -->|obs or messages| p1
        p2((Agent2)) --action--> m([Manager]) -->|obs or messages| p2
        p3((Agent3)) --action--> m([Manager]) -->|obs or messages| p3
    end    
    
    subgraph collector [Collector]
        VE(VecEnv) ==Transition==> mapolicy ==Action==> VE;
    end
    
    subgraph alg [Algorithm]
        collector ==Data==> B[(Buffer)] ==Sample==> T{Trainer} ==>|Processed Sample| mapolicy ==Info==> T
        T ==Info==> L{{Logger}}
    end

An algorithm corresponds to

  • A MA-policy: interaction among agents, such as the communication
  • A Buffer: what to store
  • A Trainer: drives the update, although the update itself is implemented in each agent's policy

Base MA components

Buffer

To support the sub-types of the CTDE scheme, the buffer should account for the agent dimension.

  1. For a single env, implement the MAReplayBuffer as a ReplayBufferManager with agent_num buffers.
  2. For a vectorized env, implement the VectorMAReplayBuffer as a ReplayBufferManager with agent_num * env_num buffers.

For both cases, the buffer should contain sub-buffers for shared global information, e.g., the state.
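
The layout above can be illustrated with a minimal, library-free sketch. It is not the repository's MAReplayBuffer or VectorMAReplayBuffer. The class name `ToyMABuffer`, the contiguous per-env index layout, and the per-env list for the global state are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


class ToyMABuffer:
    """Toy stand-in for a multi-agent buffer (hypothetical, not the repo's class)."""

    def __init__(self, agent_num: int, env_num: int = 1) -> None:
        self.agent_num = agent_num
        self.env_num = env_num
        # One sub-buffer per (env, agent) pair: agent_num * env_num in total.
        self.agent_buffers: List[List[Dict[str, Any]]] = [
            [] for _ in range(agent_num * env_num)
        ]
        # Assumed: one extra sub-buffer per env for shared global info (e.g. the state).
        self.global_buffers: List[List[Any]] = [[] for _ in range(env_num)]

    def index(self, env_id: int, agent_id: int) -> int:
        # Assumed layout: the sub-buffers of one env are stored contiguously.
        return env_id * self.agent_num + agent_id

    def add(
        self,
        env_id: int,
        agent_id: int,
        transition: Dict[str, Any],
        state: Optional[Any] = None,
    ) -> None:
        self.agent_buffers[self.index(env_id, agent_id)].append(transition)
        if state is not None:
            self.global_buffers[env_id].append(state)


buf = ToyMABuffer(agent_num=3, env_num=2)
buf.add(env_id=1, agent_id=2, transition={"obs": [0.0], "act": 1, "rew": 0.5},
        state=[0.0, 1.0])
```

With `env_num=1` the same class covers the single-env case, since the index reduces to `agent_id`.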

PettingZoo Env Wrapper

The wrapper acts as a vectorized env with agent_num envs.
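
One way to picture this is a wrapper that turns per-agent dictionaries into lists indexed by agent, so each agent looks like one env of a vectorized env. The sketch below is an assumption-laden toy: `ToyParallelEnv` and `AgentsAsEnvs` are hypothetical names, and the toy env only imitates the dict-keyed-by-agent style rather than calling the real PettingZoo API.

```python
from typing import Dict, List, Tuple


class ToyParallelEnv:
    """Hypothetical two-agent env that returns dicts keyed by agent name."""

    agents = ["piston_0", "piston_1"]

    def reset(self) -> Dict[str, List[float]]:
        return {a: [0.0] for a in self.agents}

    def step(self, actions: Dict[str, int]) -> Tuple[dict, dict, dict]:
        obs = {a: [float(actions[a])] for a in self.agents}
        rew = {a: 1.0 for a in self.agents}
        done = {a: False for a in self.agents}
        return obs, rew, done


class AgentsAsEnvs:
    """Expose each agent as one 'env' of a vectorized env (illustrative only)."""

    def __init__(self, env: ToyParallelEnv) -> None:
        self.env = env
        self.agents = list(env.agents)

    def __len__(self) -> int:
        # The number of 'envs' equals agent_num.
        return len(self.agents)

    def reset(self) -> List[List[float]]:
        obs = self.env.reset()
        return [obs[a] for a in self.agents]

    def step(self, actions: List[int]) -> Tuple[list, list, list]:
        # actions[i] belongs to the i-th agent, like the i-th env of a VecEnv.
        obs, rew, done = self.env.step(dict(zip(self.agents, actions)))
        return (
            [obs[a] for a in self.agents],
            [rew[a] for a in self.agents],
            [done[a] for a in self.agents],
        )


venv = AgentsAsEnvs(ToyParallelEnv())
first_obs = venv.reset()
next_obs, rewards, dones = venv.step([2, 5])
```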

Collector

An AsyncCollector with self._ready_env_ids initialized as [1 0 ... ] and self.data initialized with length env_num seems suitable, together with env_id in the returned info.

MAPolicyManager

The MAPolicyManager maintains the centralized part inside the class.
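
As a rough sketch of that idea, the manager below owns the centralized computation and hands each agent its own observation plus a shared message. It is not the repository's MAPolicyManager. `ToyAgent`, `ToyManager`, and the mean-observation message are hypothetical choices made only to show where the centralized part lives.

```python
from typing import Dict, List


class ToyAgent:
    """Hypothetical decentralized policy: acts on its own obs plus a message."""

    def __init__(self, name: str, bias: float) -> None:
        self.name = name
        self.bias = bias

    def act(self, obs: float, message: float) -> float:
        return obs + message + self.bias


class ToyManager:
    """Keeps the centralized part in one place and routes data to the agents."""

    def __init__(self, agents: List[ToyAgent]) -> None:
        self.agents = agents

    def centralized_message(self, obs: Dict[str, float]) -> float:
        # Assumed centralized part: the mean observation, shared with every agent.
        return sum(obs.values()) / len(obs)

    def forward(self, obs: Dict[str, float]) -> Dict[str, float]:
        message = self.centralized_message(obs)
        return {a.name: a.act(obs[a.name], message) for a in self.agents}


manager = ToyManager([ToyAgent("agent_1", 0.0), ToyAgent("agent_2", 1.0)])
actions = manager.forward({"agent_1": 1.0, "agent_2": 3.0})
```

Because the agents never see each other's observations directly, replacing `centralized_message` is enough to try a different communication rule.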

Instructions

Install

sudo apt install swig -y
pip install 'pettingzoo[all]'
# Install my version of tianshou
git clone https://github.com/Leo-xh/tianshou.git
cd tianshou
pip install -e .
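
After installing, a quick check that both packages are importable can save a debugging round. The helper below is a small sketch, not part of the repository, and `check_installed` is a hypothetical name. It only looks the modules up, without importing them.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def check_installed(modules: Iterable[str] = ("tianshou", "pettingzoo")) -> Dict[str, bool]:
    """Report, for each top-level module name, whether Python can locate it."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, ok in check_installed().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```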
