
jangirrishabh / overcoming-exploration-from-demos

147.0 5.0 31.0 748 KB

Implementation of the paper "Overcoming Exploration in Reinforcement Learning with Demonstrations" (Nair et al.) on top of the HER baselines from OpenAI.

License: MIT License

Python 100.00%
reinforcement-learning reinforcement-learning-agent learning-from-demonstration robotics hindsight-experience-replay gazebo ros openai-gym ddpg-algorithm actor-critic

overcoming-exploration-from-demos's People

Contributors

jangirrishabh, ovidr


overcoming-exploration-from-demos's Issues

Error with demos

Could you share your demonstration files? And could you explain how to debug this:

File "/home/khurram/DL_skoltech/Research_hand/HER_overcoming_exploration/experiment/ddpg.py", line 155, in initDemoBuffer
demoData = np.load(demoDataFile)
File "/home/khurram/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 422, in load
fid = open(os_fspath(file), "rb")
IsADirectoryError: [Errno 21] Is a directory: '/home/khurram/DL_skoltech/'
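The IsADirectoryError at the bottom of the traceback means np.load was handed the directory /home/khurram/DL_skoltech/ instead of the .npz file itself. A minimal, self-contained sketch of loading a demonstration file correctly (the file name and array shapes here are hypothetical, not from the repo):

```python
import os
import tempfile

import numpy as np

# Create a minimal demonstration file so the example is self-contained;
# in practice this would be the .npz produced by the recording script.
demo_file = os.path.join(tempfile.mkdtemp(), "demo_pick_and_place.npz")
np.savez_compressed(demo_file,
                    obs=np.zeros((2, 5, 10)),   # (episodes, steps + 1, obs_dim)
                    acs=np.zeros((2, 4, 3)),    # (episodes, steps, act_dim)
                    info=np.zeros((2, 4)))

# np.load must receive the path to the .npz file itself; passing its
# parent directory raises IsADirectoryError, as in the traceback above.
demo_data = np.load(demo_file)
print(sorted(demo_data.files))  # ['acs', 'info', 'obs']
```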

the number of CPU cores to use

When I set --num_cpu > 1, I have to wait a long time at this line:

2018-12-26 11:20:50.480133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-26 11:20:50.480139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-26 11:20:50.480292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6899 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
Training...

It seems like the program never enters the training phase.
Is there any special setting needed to use more than one CPU?

Learning Time

How long did you train the Barrett WAM arm for the pick-and-place task using this method?
gym-gazebo is quite slow for me to run the learning algorithm on.

ValueError: Object arrays cannot be loaded when allow_pickle=False

Hello, I am trying to train a robot pick and place using HER + DDPG and BC loss.

I have modified her.config.py to use the BC loss and the Q-filter. The number of demonstrations is set to 50, matching the ones I recorded.

I recorded demonstrations using a python script and saved the file with:

fileName = "pick_rail_norandom"
fileName += ".npz"
np.savez_compressed(fileName, acs=actions, obs=observations, info=infos)

I know the file is fine because it is the same one used by another project built on baselines, with no problems at all. I also verified that the agent performed well while recording, by watching the environment and the recorded trajectories.

When I try to start training with MPI, specifically this command:

mpirun -n 4 python3 -m baselines.run --alg=her --env=dVRLPick-v0 --num_timesteps=1e6 --demo_file=/home/neri/Desktop/RL4dVRK/dVRL_simulator/record_demonstration_dVRL/pick_rail_norandom.npz --save_path=~/models/pickrail_norandom --log_path=~/logs/pickrail_norandom

the training exits with:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 255, in <module>
    main(sys.argv)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 221, in main
    model, env = train(args, extra_args)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 85, in train
    **alg_kwargs
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/her.py", line 177, in learn
    policy_save_interval=policy_save_interval, demo_file=demo_file)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/her.py", line 35, in train
    if policy.bc_loss == 1: policy.init_demo_buffer(demo_file) #initialize demo buffer if training with demonstrations
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/ddpg.py", line 166, in init_demo_buffer
    demo_data_obs = demoData['obs']
  File "/home/neri/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 255, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/home/neri/.local/lib/python3.6/site-packages/numpy/lib/format.py", line 727, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[59978,1],0]
  Exit code:    1
--------------------------------------------------------------------------

Any advice would be highly appreciated.

Also: is there a difference between setting num_env > 1 and using MPI with a given number of processes?

Best regards
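A likely cause, given the error message: NumPy 1.16.3 and later default np.load to allow_pickle=False, and demonstrations stored as object arrays (for example, episodes of unequal length) can only be read back with pickling enabled. A self-contained sketch of this workaround follows; the file name is illustrative, and in the repo the change would go where the demo file is loaded (init_demo_buffer in ddpg.py):

```python
import os
import tempfile

import numpy as np

# Episodes of unequal length force NumPy to store the data as an object
# array, which np.savez_compressed can only serialize via pickle.
observations = np.empty(2, dtype=object)
observations[0] = np.zeros((5, 10))
observations[1] = np.zeros((7, 10))

demo_file = os.path.join(tempfile.mkdtemp(), "pick_rail_norandom.npz")
np.savez_compressed(demo_file, obs=observations)

# With NumPy >= 1.16.3, np.load(demo_file)['obs'] raises the ValueError
# above; passing allow_pickle=True restores the old behaviour.
demo_data = np.load(demo_file, allow_pickle=True)
print(demo_data['obs'][1].shape)  # (7, 10)
```

Only enable allow_pickle for files you trust, since unpickling can execute arbitrary code; alternatively, re-save the demonstrations as regular fixed-shape arrays so no pickling is needed.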

ValueError: not allowed to raise maximum limit

When I run the following command:

python experiment/train.py

I get the following error:
File "experiment/train.py", line 244, in main
launch(**kwargs)
File "experiment/train.py", line 132, in launch
resource.setrlimit(resource.RLIMIT_NOFILE, (65536, 65536))
ValueError: not allowed to raise maximum limit

Is there any solution?
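One plausible workaround: an unprivileged process cannot raise its hard file-descriptor limit, so hard-coding (65536, 65536) fails whenever the existing hard limit is lower. A sketch of a defensive version of that call (assuming the same resource.setrlimit line in experiment/train.py), capping the request at the hard limit the kernel already grants:

```python
import resource

# Query the current soft and hard RLIMIT_NOFILE values.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# An unprivileged process may raise its soft limit only up to the hard
# limit, so never request more than the kernel already allows.
if hard == resource.RLIM_INFINITY:
    target = 65536
else:
    target = min(65536, hard)

resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
print(resource.getrlimit(resource.RLIMIT_NOFILE)[0] == target)  # True
```

Alternatively, raise the hard limit outside the process (e.g. via ulimit or limits.conf) before launching training.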
