(NeurIPS 2018) Hardware Conditioned Policies for Multi-Robot Transfer Learning

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Tao Chen, Adithya Murali, Abhinav Gupta

The Robotics Institute, Carnegie Mellon University

This is a PyTorch-based implementation of our NeurIPS 2018 paper on hardware conditioned policies. The idea is to augment the policy input (the state) with a hardware-specific encoding vector so that skills transfer better across robots. The encoding vector can be either explicitly constructed (HCP-E) or learned implicitly via back-propagation (HCP-I). The approach is compatible with most existing deep reinforcement learning algorithms; we demonstrate it with DDPG+HER and PPO. If you find this work useful in your research, please cite:

@inproceedings{chen2018hardware,
  title={Hardware Conditioned Policies for Multi-Robot Transfer Learning},
  author={Chen, Tao and Murali, Adithyavairavan and Gupta, Abhinav},
  booktitle={Advances in Neural Information Processing Systems},
  pages={9355--9366},
  year={2018}
}

The code has been tested on Ubuntu 16.04.
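
The state augmentation described in the introduction can be illustrated in a few lines of plain Python. This is only a sketch under stated assumptions, not the code in this repository: the function name `augment_state`, the vector sizes, and all numeric values are hypothetical. For HCP-E the encoding is taken to be a hand-constructed vector, and for HCP-I a per-robot vector that training would update by back-propagation (here it is just a fixed placeholder).

```python
from typing import Dict, List, Sequence


def augment_state(state: Sequence[float], hw_encoding: Sequence[float]) -> List[float]:
    """Concatenate the environment state with a hardware encoding vector."""
    return list(state) + list(hw_encoding)


# Hypothetical explicit encoding (HCP-E style): hand-built numbers that
# describe one robot. The values are made up for illustration.
explicit_encoding = [0.30, 0.25, 0.10]

# Hypothetical implicit encodings (HCP-I style): one vector per robot id.
# During training these would be learned; here they are fixed placeholders.
implicit_table: Dict[int, List[float]] = {
    0: [0.0, 0.0],
    1: [0.1, -0.2],
}

state = [1.0, 2.0, 3.0, 4.0]  # made-up environment observation

policy_input_e = augment_state(state, explicit_encoding)
policy_input_i = augment_state(state, implicit_table[1])

print(len(policy_input_e))  # 4 state dims + 3 encoding dims
print(len(policy_input_i))  # 4 state dims + 2 encoding dims
```

In both variants the policy network itself is unchanged; only its input grows by the length of the encoding vector, which is why the idea can sit on top of different reinforcement learning algorithms.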

Installation

  1. Install Anaconda

  2. Download code repo:

cd ~
git clone https://github.com/taochenshh/hcp.git
cd hcp
  3. Create Python environment
conda env create -f environment.yml
conda activate hcp
  4. Install MuJoCo and mujoco-py 1.50
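
After the steps above, a quick check that the key packages are visible from the activated environment can save time before training. The snippet below is a minimal sketch, not part of the repository: the helper name `check_modules` is hypothetical, and it only tests whether the modules can be located, not whether MuJoCo itself is licensed and working.

```python
import importlib.util
from typing import Dict, Iterable


def check_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level modules can be found, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "torch" and "mujoco_py" are the import names of PyTorch and mujoco-py.
    for name, found in check_modules(["torch", "mujoco_py"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it with the `hcp` environment activated; a `MISSING` line points to the installation step that needs to be repeated.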

HCP-E Usage

  1. Generate robot xml files
cd gen_robots
chmod +x gen_multi_dof_simrobot.sh
## generate both peg_insertion and reacher environments
./gen_multi_dof_simrobot.sh peg_insertion reacher
## generate peg_insertion environments only
./gen_multi_dof_simrobot.sh peg_insertion
## generate reacher environments only
./gen_multi_dof_simrobot.sh reacher
  2. Train the policy model
cd ../HCP-E

## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E

## HCP-E: reacher
cd util
python gen_start_and_goal.py
cd ..
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E
  3. Test the policy model
## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E --test

## HCP-E: reacher
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E --test

Append --render to the command if you want to watch the policy while testing.
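
The `--train_ratio=0.9` flag in the commands above suggests that the generated robots are divided into a training set and a held-out test set. How the repository performs that division is not shown in this README, so the following is only one plausible sketch: the function name `split_robots`, the fixed seed, and the assumption that every entry directly under the robot directory is one robot are all hypothetical.

```python
import random
from pathlib import Path
from typing import List, Tuple


def split_robots(robot_dir: str, train_ratio: float = 0.9,
                 seed: int = 0) -> Tuple[List[Path], List[Path]]:
    """Split the entries of robot_dir into train and test lists.

    Assumption: every entry directly under robot_dir is one robot.
    """
    robots = sorted(Path(robot_dir).iterdir())  # sort for a reproducible order
    random.Random(seed).shuffle(robots)         # seeded shuffle, same split each run
    n_train = int(len(robots) * train_ratio)
    return robots[:n_train], robots[n_train:]


if __name__ == "__main__":
    import tempfile

    # Demo on a throwaway directory holding ten placeholder robot files.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(10):
            (Path(tmp) / f"robot_{i}.xml").touch()
        train, test = split_robots(tmp, train_ratio=0.9)
        print(len(train), len(test))
```

Sorting before the seeded shuffle keeps the split identical between the training run and the later `--test` run, which matters if the test set is meant to contain only robots the policy has never seen.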

HCP-I Usage

  1. Generate robot xml files
cd gen_robots
python gen_hoppers.py --robot_num=1000
  2. Train the policy model
cd ../HCP-I

python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I
  3. Test the policy model
python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I --test

Append --render to the command if you want to watch the policy while testing.
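
The train and test commands above differ only in their trailing flags, so a small wrapper can keep them in sync. This is a convenience sketch rather than part of the repository: the helper name `hcp_i_command` is hypothetical, it uses only the flags listed in this README, and `shlex.join` requires Python 3.8 or newer.

```python
import shlex
from typing import List


def hcp_i_command(test: bool = False, render: bool = False) -> List[str]:
    """Build the HCP-I hopper command from the flags listed in this README."""
    cmd = [
        "python", "main.py",
        "--env=hopper",
        "--with_embed",
        "--robot_dir=../xml/gen_xmls/hopper",
        "--save_dir=hopper_data/HCP-I",
    ]
    if test:
        cmd.append("--test")
    if render:
        cmd.append("--render")
    return cmd


# Print the training command and the rendered test command.
print(shlex.join(hcp_i_command()))
print(shlex.join(hcp_i_command(test=True, render=True)))
```

To launch a run instead of printing it, the returned list can be passed to `subprocess.run(cmd, check=True, cwd="HCP-I")` from the repository root.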
