Giter Site home page Giter Site logo

wnnlyf / hanabi_sad Goto Github PK

View Code? Open in Web Editor NEW

This project forked from facebookresearch/hanabi_sad

0.0 0.0 0.0 47 KB

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

License: Other

CMake 1.27% C++ 52.71% Python 43.18% Shell 2.84%

hanabi_sad's Introduction

Simplified Action Decoder in Hanabi

This repo contains code and models for Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning.

To reference this work, please use:

@misc{hu2019simplified,
    title={Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning},
    author={Hengyuan Hu and Jakob N Foerster},
    year={2019},
    eprint={1912.02288},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}

Compile

Prerequisite

Install cudnn7, cuda9.2 and gcc7. This might be platform dependent. Other versions might also work but we have only tested with the above versions. Note that we discovered a deadlock problem when using tensors with C++ multi-threading when using cuda10.0 on Pascal GPU.

Build PyTorch from Source

Create a fresh conda env & compile PyTorch from source. If PyTorch and this repo are compiled by compilers with different ABI compatibility, mysterious bugs that unexpectedly corrupt memory may occur. To avoid that, the current solution is to compile & install PyTorch from source first and then compile this repo against that PyTorch binary. For convenience, we paste instructions of compling PyTorch here.

# create a fresh conda environment with python3
conda create --name [your env name] python=3.7
conda activate [your env name]

conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c pytorch magma-cuda92

# clone pytorch
git clone -b v1.3.0 --recursive https://github.com/pytorch/pytorch
cd pytorch

export CMAKE_PREFIX_PATH=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
# set cuda arch list so that the built binary can be run on both pascal and volta
TORCH_CUDA_ARCH_LIST="6.0;7.0" python setup.py install

Additional dependencies

pip install tensorboardX
pip install psutil

# if the current cmake version is < 3.15
conda install -c conda-forge cmake

Clone & Build this repo

For convenience, add the following lines to your .bashrc, after the line of conda activate xxx.

# set path
CONDA_PREFIX=${CONDA_PREFIX:-"$(dirname $(which conda))/../"}
export CPATH=${CONDA_PREFIX}/include:${CPATH}
export LIBRARY_PATH=${CONDA_PREFIX}/lib:${LIBRARY_PATH}
export LD_LIBRARY_PATH=${CONDA_PREFIX}/lib:${LD_LIBRARY_PATH}

# avoid tensor operation using all cpu cores
export OMP_NUM_THREADS=1

Clone & build.

git clone --recursive https://github.com/facebookresearch/hanabi.git

cd hanabi
mkdir build
cd build
# next line may produce an error 'Target "torch" not found.',
# which can be ignored
cmake ..
make -j10

Run

hanabi/pyhanabi/tools contains some example scripts to launch training runs. dev.sh requires 2 gpus to run, 1 for training, 1 for simulation while the rest require 3 gpus, 1 for training, 2 for simulation.

cd pyhanabi
sh tools/dev.sh

Trained Models

Run the following command to download the trained models used to produce tables in the paper.

cd model
sh download.sh

To evaluate a model, simply run

cd pyhanabi
python tools/eval_model.py --weight ../models/sad_2p_10.pthw --num_player 2

Related Repos

The results on Hanabi can be further improved by running search on top of our agents. Please refer to the paper and code for details.

We also open-sourced a single agent implementation of R2D2 tested on Atari here.

Contribute

Python

Use black to format python code, run black *.py before pushing

C++

The root contains a .clang-format file that define the coding style of this repo, run the following command before submitting PR or push

clang-format -i *.h
clang-format -i *.cc

Copyright

Copyright (c) Facebook, Inc. and its affiliates. All rights reserved.

This source code is licensed under the license found in the LICENSE file in the root directory of this source tree.

hanabi_sad's People

Contributors

hengyuan-hu avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.