huangwl18 / modular-rl

[ICML 2020] PyTorch Code for "One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control"

Home Page: https://huangwl18.github.io/modular-rl/

License: Other

Languages: Python 13.64%, Jupyter Notebook 86.36%

Topics: deep-learning, reinforcement-learning, modularity, graph-neural-networks, locomotion, generalization, modular-control, decentralized-control, message-passing, emergent-communication

modular-rl's People

Contributors

dependabot[bot], huangwl18, pathak22

modular-rl's Issues

The program can't use SubprocVecEnv

SubprocVecEnv uses multiprocessing, so the helper created in utils.makeEnvWrapper fails when it runs in a subprocess:

    def helper():
        # Runs inside the worker process, where the env may not be registered
        e = gym.make("%s-v0" % env_name)
        e.seed(seed)
        return wrappers.ModularEnvWrapper(e, obs_max_len)

gym.make can't find the registered env, because the registry in gym.envs.registration is initialized again in each subprocess.

List of training/test environments

Hi @huangwl18, @pathak22! Thanks for the code release.

The paper does not specify which environments were used for training and which for zero-shot evaluation. For instance, Humanoid++ has 8 environments, of which two were used for zero-shot evaluation. Can you tell me which two?

Could you please provide the full list of training and test environments used in the paper?

Incorrect assignment of limb_type_vec in environment modules

Hello Wenlong,
Thanks for sharing your interesting work!

I have run some experiments with your code, and I think there is a typo in _get_obs_per_limb() in the environment .py files.
Specifically, for humanoid++, limb_type_vec is assigned incorrectly: every limb is assigned (0, 0, 0, 0).

In the humanoid XMLs, the limb body names belong to {'torso', '(left/right) shoulder', '(left/right) thigh', '(left/right) shin', '(left/right) upper arm', '(left/right) lower arm'}, but the limb type assignment condition compares the name against {'hip', 'knee', 'shoulder', 'elbow'}, which are the names of the motors (joints) in your code.

I'm wondering whether this is intended, and I would like to hear from you whether it makes a difference in model performance.

Thank you
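
The mismatch described in this issue can be checked with a small diagnostic. This is a sketch under stated assumptions: the body names below are a hypothetical subset written with underscores, substring matching is assumed, and `BODY_KEYWORDS` is an illustrative alternative rather than a fix taken from the repository.

```python
# Keywords quoted in the issue as the ones the condition compares against.
JOINT_KEYWORDS = ("hip", "knee", "shoulder", "elbow")
# Hypothetical alternative built from body-style names (an assumption).
BODY_KEYWORDS = ("thigh", "shin", "upper_arm", "lower_arm")


def limb_type_vec(body_name, keywords):
    """Indicator vector over keywords (substring match); all zeros if none match."""
    return tuple(int(k in body_name) for k in keywords)


def unmatched(body_names, keywords):
    """Body names whose limb type vector is all zeros."""
    return [n for n in body_names if not any(limb_type_vec(n, keywords))]


# Hypothetical subset of body names, assuming an underscore naming style.
names = ["torso", "left_thigh", "left_shin", "left_upper_arm", "left_lower_arm"]

print(unmatched(names, JOINT_KEYWORDS))  # names that get (0, 0, 0, 0) with joint keywords
print(unmatched(names, BODY_KEYWORDS))   # names that get (0, 0, 0, 0) with body keywords
```

Running the same check on the names actually parsed from the XML files would show which limbs end up with an all-zero type vector.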

code crashed after reaching maxstep=20k

Very interesting work! Thanks for sharing the code.

I ran into an issue when setting max_timesteps=20000.

To reproduce it:

python main.py --expID 002 --td --bu --morphologies walker_7_main --max_timesteps 20000

It looks like training finished, but an error was produced at the end:

ExpID: 2, FPS: 5.03, TotalT: 19902, EpisodeNum: 157, SampleNum: 20059, ReplayBSize: 20059
walker_7_main === EpisodeT: 98, Reward: 232.93
*** training finished and model saved to ./results/EXP_0002/model.pyth ***
Process Process-1:
Traceback (most recent call last):
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/site-packages/baselines/common/vec_env/subproc_vec_env.py", line 10, in worker
    cmd, data = remote.recv()
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/opt/anaconda3/envs/modular-rl/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError
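
The traceback ends in `remote.recv()` raising EOFError inside the vectorized-env worker. With Python's multiprocessing pipes, that is what `recv()` does when nothing is left to read and the other end has been closed, so one likely reading (an assumption, not confirmed in this thread) is that the main process exited without first telling the worker to stop. The sketch below reproduces both endings with a single pipe in one process; `worker` and `run` are hypothetical names and only loosely modeled on such a worker loop.

```python
from multiprocessing import Pipe


def worker(remote):
    """Toy command loop: stops on a 'close' command or when the pipe is closed."""
    handled = []
    while True:
        try:
            cmd, data = remote.recv()
        except EOFError:
            # The other end was closed without sending 'close'.
            handled.append("eof")
            break
        if cmd == "close":
            handled.append("close")
            break
        handled.append(cmd)
    remote.close()
    return handled


def run(send_close):
    """Queue commands on the parent end, close it, then drain them in the worker."""
    parent, child = Pipe()
    parent.send(("step", None))
    if send_close:
        parent.send(("close", None))
    parent.close()  # already-sent messages stay readable on the child end
    return worker(child)


print(run(send_close=True))   # worker stops on the explicit command
print(run(send_close=False))  # worker hits EOFError on recv()
```

If the vectorized environment object used in training exposes a `close()` method, calling it before the program exits is the usual way to end the workers cleanly. Since the log shows the model was saved first, the error appears to occur only during shutdown.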

Not able to run

Hello,
when I try to run your code with the default arguments:

python main.py --expID 004 --td --bu --morphologies hopper

I get the following error:

raise error.NameNotFound(message)
gym.error.NameNotFound: Environment 'environments:hopper_3' doesn't exist.

Do you know what could cause this issue?

Multi-CPU parallel training

Hi, Wenlong. Thanks for sharing your code.

When I ran your code, I found that only one CPU was used for training, so training was a bit slow. Can your code train on multiple CPUs in parallel? I can't find a corresponding option in the configuration.

Why xpos[0] -= torso_x_pos

Dear author,

Thanks for sharing the code.
My question is: why does xpos[0] -= torso_x_pos modify only the x-coordinate of a body's position, and not the y- and/or z-coordinates?
Thank you.
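
What the line in question does can be shown on plain numbers; why the authors chose it is not stated in this thread. One plausible reading, offered here only as an assumption, is that the agents move forward along x, so absolute x grows without bound while y and z stay in a limited range. The helper `relative_x` below is hypothetical and works on plain lists.

```python
def relative_x(xpos, torso_x_pos):
    """Return a copy of a body's (x, y, z) position with x measured from the torso."""
    rel = list(xpos)
    rel[0] -= torso_x_pos  # only the x-coordinate changes; y and z are untouched
    return rel


# The same pose observed early and late in an episode (hypothetical numbers).
early = relative_x([10.5, 0.0, 1.2], torso_x_pos=10.0)
late = relative_x([100.5, 0.0, 1.2], torso_x_pos=100.0)
print(early, late)
```

Under that reading, both calls describe the same pose relative to the torso, so the observation does not depend on how far the agent has travelled.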
