quad-swarm-rl's Introduction

Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning

A codebase for training reinforcement learning policies for quadrotor swarms. Includes:

Paper: https://arxiv.org/abs/2109.07735

Website: https://sites.google.com/view/swarm-rl

Installation

Initialize a Python environment, e.g. with conda (Python versions 3.6-3.8 are supported):

conda create -n swarm-rl python=3.8
conda activate swarm-rl
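A quick way to confirm that the activated environment falls inside the supported range is sketched below. It is a minimal check that only encodes the 3.6-3.8 range stated above; the helper name is_supported is hypothetical and not part of this repository.

```python
import sys

SUPPORTED_MIN = (3, 6)
SUPPORTED_MAX = (3, 8)

def is_supported(version=None) -> bool:
    """Return True if the (major, minor) version lies within 3.6-3.8."""
    # Fall back to the running interpreter when no version is passed in.
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX

if __name__ == "__main__":
    print("supported" if is_supported() else "unsupported Python version")
```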

Clone and install this repo as an editable Pip package:

git clone https://github.com/Zhehui-Huang/quad-swarm-rl.git
cd quad-swarm-rl
pip install -e .

This should pull and install all the necessary dependencies including PyTorch.

Running experiments

Train

We provide a training script, train.sh, so you can start training by running bash train.sh. This runs the baseline experiment. Change the number of workers to match the number of logical CPU cores on your machine, but keep the total number of simulated environments close to that in the original command.
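The advice on workers and environments can be sketched as a small calculation. This is a rough illustration, assuming the total number of simulated environments is the product of Sample Factory's --num_workers and --num_envs_per_worker settings. The target of 144 environments is a hypothetical placeholder, not the value used in train.sh, and envs_per_worker is a hypothetical helper.

```python
import os

def envs_per_worker(target_total_envs: int, num_workers: int) -> int:
    """Return the per-worker environment count closest to the target total."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Round to the nearest integer, but keep at least one environment per worker.
    return max(1, round(target_total_envs / num_workers))

# Hypothetical target: replace with the product taken from the original command.
TARGET_TOTAL_ENVS = 144

if __name__ == "__main__":
    num_workers = os.cpu_count() or 1  # logical CPU cores on this machine
    per_worker = envs_per_worker(TARGET_TOTAL_ENVS, num_workers)
    print(f"--num_workers={num_workers} --num_envs_per_worker={per_worker}")
```

The printed values are only a starting point; check them against the settings actually used in train.sh before launching a run.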

Or, even better, you can use the runner scripts in swarm_rl/runs/. These runner scripts (a Sample Factory feature) are Python files that contain experiment parameters and support features such as evaluation on multiple seeds and grid searches.

To execute a runner script, run the following command:

python -m sample_factory.launcher.run --run=swarm_rl.runs.single_quad.single_quad --max_parallel=4 --pause_between=1 --experiments_per_gpu=1 --num_gpus=4

This command will start training four different seeds in parallel on a 4-GPU server. Adjust the parameters accordingly to match your hardware setup.
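Adapting the launcher command to different hardware can be sketched as below. The flags are the ones shown in the command above. The rule max_parallel = num_gpus * experiments_per_gpu is an assumption that happens to match the 4-GPU example, not a documented requirement, and build_launcher_cmd is a hypothetical helper.

```python
import shlex

def build_launcher_cmd(run, num_gpus, experiments_per_gpu=1, pause_between=1):
    """Assemble the launcher command for a given GPU count."""
    # Assumption: one parallel slot per (GPU, experiment) pair.
    max_parallel = num_gpus * experiments_per_gpu
    return [
        "python", "-m", "sample_factory.launcher.run",
        f"--run={run}",
        f"--max_parallel={max_parallel}",
        f"--pause_between={pause_between}",
        f"--experiments_per_gpu={experiments_per_gpu}",
        f"--num_gpus={num_gpus}",
    ]

if __name__ == "__main__":
    cmd = build_launcher_cmd("swarm_rl.runs.single_quad.single_quad", num_gpus=4)
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

To launch the command directly instead of printing it, the returned list can be passed to subprocess.run.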

To monitor the experiments, go to the experiment folder, and run the following command:

tensorboard --logdir=./

WandB support

To monitor training with WandB:

  • Add --with_wandb=True to the command.

Here is the full list of WandB settings:

--with_wandb: Enables Weights and Biases integration (default: False)
--wandb_user: WandB username (entity). Must be specified from command line! Also see https://docs.wandb.ai/quickstart#1.-set-up-wandb (default: None)
--wandb_project: WandB "Project" (default: sample_factory)
--wandb_group: WandB "Group" (to group your experiments). By default this is the name of the env. (default: None)
--wandb_job_type: WandB job type (default: SF)
--wandb_tags: [WANDB_TAGS [WANDB_TAGS ...]] Tags can help with finding experiments in WandB web console (default: [])
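The settings listed above can be collected into a command-line fragment with a small helper. This is only a sketch: wandb_args is a hypothetical function, the user, group, and tag values are placeholders, and the space-separated form of --wandb_tags is inferred from the usage string in the list.

```python
def wandb_args(user, project="sample_factory", group=None, tags=()):
    """Return the WandB flags as a list of command-line arguments."""
    if not user:
        # The settings list says the user must be given on the command line.
        raise ValueError("--wandb_user must be specified")
    args = ["--with_wandb=True", f"--wandb_user={user}", f"--wandb_project={project}"]
    if group is not None:
        args.append(f"--wandb_group={group}")
    if tags:
        # Several tags follow a single --wandb_tags flag.
        args += ["--wandb_tags", *tags]
    return args

if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(wandb_args("my_wandb_user", group="baseline", tags=["quads", "seed0"])))
```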

Test

To test the trained model, run the following command:

python -m swarm_rl.enjoy --algo=APPO --env=quadrotor_multi --replay_buffer_sample_prob=0 --quads_use_numba=False --train_dir=PATH_TO_TRAIN_DIR --experiment=EXPERIMENT_NAME --quads_view_mode CAMERA_VIEWS

EXPERIMENT_NAME and PATH_TO_TRAIN_DIR can be found in the cfg.json file of your trained model.

CAMERA_VIEWS can be one or more of the following views: [topdown, global, chase, side, corner0, corner1, corner2, corner3, topdownfollow]
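Filling in the placeholders of the test command can be sketched as follows. The code assumes that cfg.json stores the two values under the keys "train_dir" and "experiment"; verify this against your own file. build_enjoy_cmd and load_cfg are hypothetical helpers, and the example config values are placeholders.

```python
import json
import shlex
from pathlib import Path

CAMERA_VIEWS = ["topdown", "global", "chase", "side", "corner0",
                "corner1", "corner2", "corner3", "topdownfollow"]

def load_cfg(path):
    """Read the cfg.json file saved with a trained model."""
    return json.loads(Path(path).read_text())

def build_enjoy_cmd(cfg, views):
    """Build the test command from a loaded config and the chosen views."""
    unknown = [v for v in views if v not in CAMERA_VIEWS]
    if not views or unknown:
        raise ValueError(f"invalid camera views: {unknown or 'none given'}")
    return [
        "python", "-m", "swarm_rl.enjoy", "--algo=APPO", "--env=quadrotor_multi",
        "--replay_buffer_sample_prob=0", "--quads_use_numba=False",
        # Assumption: cfg.json uses the keys "train_dir" and "experiment".
        f"--train_dir={cfg['train_dir']}",
        f"--experiment={cfg['experiment']}",
        "--quads_view_mode", *views,
    ]

if __name__ == "__main__":
    # Placeholder config; in practice use load_cfg("path/to/cfg.json").
    example_cfg = {"train_dir": "./train_dir", "experiment": "my_experiment"}
    cmd = build_enjoy_cmd(example_cfg, ["topdown", "chase"])
    print(" ".join(shlex.quote(arg) for arg in cmd))
```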

Unit Tests

To run unit tests:

./run_tests.sh

Citation

If you use this repository in your work or otherwise wish to cite it, please reference our CoRL paper.

@inproceedings{batra21corl,
  author    = {Sumeet Batra and
               Zhehui Huang and
               Aleksei Petrenko and
               Tushar Kumar and
               Artem Molchanov and
               Gaurav S. Sukhatme},
  title     = {Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning},
  booktitle = {5th Conference on Robot Learning, CoRL 2021, 8-11 November 2021, London, England, {UK}},
  series    = {Proceedings of Machine Learning Research},
  publisher = {{PMLR}},
  year      = {2021},
  url       = {https://arxiv.org/abs/2109.07735}
}

GitHub issues and pull requests are welcome.
