(NeurIPS 2018) Hardware Conditioned Policies for Multi-Robot Transfer Learning

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Tao Chen, Adithya Murali, Abhinav Gupta

The Robotics Institute, Carnegie Mellon University

This is a PyTorch-based implementation of our NeurIPS 2018 paper on hardware conditioned policies. The idea is to augment the policy input (the state) with a hardware-specific encoding vector so that skills transfer better across robots. The encoding vector can be either constructed explicitly (HCP-E) or learned implicitly via back-propagation (HCP-I). The approach is compatible with most existing deep reinforcement learning algorithms; we demonstrate it with DDPG+HER and PPO. If you find this work useful in your research, please cite:

@inproceedings{chen2018hardware,
  title={Hardware Conditioned Policies for Multi-Robot Transfer Learning},
  author={Chen, Tao and Murali, Adithyavairavan and Gupta, Abhinav},
  booktitle={Advances in Neural Information Processing Systems},
  pages={9355--9366},
  year={2018}
}
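
As a rough illustration of the state augmentation described in the introduction, the sketch below concatenates a state vector with a hardware encoding. It uses only the Python standard library, and every name and value in it (augment_state, the link lengths, the embedding table, the robot identifiers) is an assumption made for illustration, not code from this repository.

```python
# Illustrative sketch only: names, shapes and values are assumptions,
# not the repository's implementation.
from typing import Dict, List


def augment_state(state: List[float], hw_encoding: List[float]) -> List[float]:
    """Return the policy input: the raw state followed by the hardware encoding."""
    return list(state) + list(hw_encoding)


# HCP-E style: the encoding is constructed explicitly from known robot
# properties. Here a hypothetical two-link arm is described by link lengths.
explicit_encoding = [0.30, 0.25]

# HCP-I style: each robot owns one row of an embedding table. In a real
# setup the rows would be trainable parameters updated by back-propagation
# together with the policy weights; here they are fixed placeholders.
embedding_table: Dict[str, List[float]] = {
    "robot_0": [0.0, 0.0, 0.0],
    "robot_1": [0.0, 0.0, 0.0],
}

state = [0.1, -0.4, 0.7]
policy_input_e = augment_state(state, explicit_encoding)
policy_input_i = augment_state(state, embedding_table["robot_1"])
print(len(policy_input_e), len(policy_input_i))
```

In both cases the policy network sees a single flat vector, which is why the approach can sit on top of an existing reinforcement learning algorithm without changing its update rule.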

The code has been tested on Ubuntu 16.04.

Installation

  1. Install Anaconda

  2. Download code repo:

cd ~
git clone https://github.com/taochenshh/hcp.git
cd hcp
  3. Create the Python environment
conda env create -f environment.yml
conda activate hcp
  4. Install MuJoCo and mujoco-py 1.50
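
After the installation steps, a quick check that the key packages are visible to the active environment can save a failed training run. The sketch below is a hypothetical helper, not part of the repository. It assumes the import names torch (PyTorch) and mujoco_py (mujoco-py), and it only looks the modules up without importing them.

```python
# Hypothetical sanity check; not part of the repository.
import importlib.util
from typing import List


def missing_modules(names: List[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for PyTorch and mujoco-py.
required = ["torch", "mujoco_py"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

Run it inside the activated hcp environment. A missing mujoco_py usually means the last installation step has not been completed.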

HCP-E Usage

  1. Generate robot xml files
cd gen_robots
chmod +x gen_multi_dof_simrobot.sh
## generate both peg_insertion and reacher environments
./gen_multi_dof_simrobot.sh peg_insertion reacher
## generate peg_insertion environments only
./gen_multi_dof_simrobot.sh peg_insertion
## generate reacher environments only
./gen_multi_dof_simrobot.sh reacher
  2. Train the policy model
cd ../HCP-E

## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E

## HCP-E: reacher
cd util
python gen_start_and_goal.py
cd ..
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E
  3. Test the policy model
## HCP-E: peg_insertion
python main.py --env=peg_insertion --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/peg_insertion --save_dir=peg_data/HCP-E --test

## HCP-E: reacher
python main.py --env=reacher --with_kin --train_ratio=0.9 --save_interval=200 --robot_dir=../xml/gen_xmls/simrobot/reacher --save_dir=reacher_data/HCP-E --test

Append --render to the command if you want to watch the policy while it is being tested.
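
The commands above pass --train_ratio=0.9. This README does not spell out how the flag is used, but a reasonable reading is that 90% of the generated robots are used for training and the rest are held out for testing. The sketch below shows one way such a split could be made; split_robots, the seed, and the robot names are assumptions for illustration, not the repository's logic.

```python
# Hypothetical sketch of a train/test split over robots; the repository's
# actual handling of --train_ratio may differ.
import random
from typing import List, Tuple


def split_robots(
    robot_ids: List[str], train_ratio: float, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle robot ids reproducibly and split them into train and test sets."""
    if not 0.0 <= train_ratio <= 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = sorted(robot_ids)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


robots = [f"robot_{i:03d}" for i in range(20)]
train, test = split_robots(robots, train_ratio=0.9)
print(len(train), len(test))
```

Fixing the seed keeps the held-out robots identical between the training run and the later --test run, which matters if the test is meant to measure transfer to unseen hardware.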

HCP-I Usage

  1. Generate robot xml files
cd gen_robots
python gen_hoppers.py --robot_num=1000
  2. Train the policy model
cd ../HCP-I

python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I
  3. Test the policy model
python main.py --env=hopper --with_embed --robot_dir=../xml/gen_xmls/hopper --save_dir=hopper_data/HCP-I --test

Append --render to the command if you want to watch the policy while it is being tested.
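
The training and testing commands differ only in the trailing --test and --render flags, so they can be assembled programmatically when scripting several runs. The helper below is a hypothetical convenience wrapper, not part of the repository. It uses only flags that appear in this README and prints the command instead of running it.

```python
# Hypothetical helper for composing the documented commands.
import shlex
from typing import List


def build_command(
    env: str,
    encoding_flag: str,
    robot_dir: str,
    save_dir: str,
    test: bool = False,
    render: bool = False,
) -> List[str]:
    """Assemble the argument list for main.py from the documented flags."""
    cmd = [
        "python",
        "main.py",
        f"--env={env}",
        encoding_flag,  # --with_embed for HCP-I, --with_kin for HCP-E
        f"--robot_dir={robot_dir}",
        f"--save_dir={save_dir}",
    ]
    if test:
        cmd.append("--test")
    if render:
        cmd.append("--render")
    return cmd


cmd = build_command(
    "hopper",
    "--with_embed",
    "../xml/gen_xmls/hopper",
    "hopper_data/HCP-I",
    test=True,
    render=True,
)
print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to subprocess.run(cmd, check=True) from inside the HCP-I directory.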
