
jangirrishabh / overcoming-exploration-from-demos

147.0 5.0 31.0 748 KB

Implementation of the paper "Overcoming Exploration in Reinforcement Learning with Demonstrations" (Nair et al.) on top of the HER baselines from OpenAI.

License: MIT License

Python 100.00%
reinforcement-learning reinforcement-learning-agent learning-from-demonstration robotics hindsight-experience-replay gazebo ros openai-gym ddpg-algorithm actor-critic

overcoming-exploration-from-demos's People

Contributors

jangirrishabh, ovidr


overcoming-exploration-from-demos's Issues

Error with demos

Could you share your demonstration files? And could you explain how to debug this:

File "/home/khurram/DL_skoltech/Research_hand/HER_overcoming_exploration/experiment/ddpg.py", line 155, in initDemoBuffer
demoData = np.load(demoDataFile)
File "/home/khurram/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 422, in load
fid = open(os_fspath(file), "rb")
IsADirectoryError: [Errno 21] Is a directory: '/home/khurram/DL_skoltech/'
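The IsADirectoryError at the bottom of the traceback means np.load was handed the directory /home/khurram/DL_skoltech/ instead of the .npz file itself. A minimal, self-contained sketch of loading a demonstration file correctly (the file name and array shapes here are hypothetical, not from the repo):

```python
import os
import tempfile

import numpy as np

# Create a minimal demonstration file so the example is self-contained;
# in practice this would be the .npz produced by the recording script.
demo_file = os.path.join(tempfile.mkdtemp(), "demo_pick_and_place.npz")
np.savez_compressed(demo_file,
                    obs=np.zeros((2, 5, 10)),   # (episodes, steps + 1, obs_dim)
                    acs=np.zeros((2, 4, 3)),    # (episodes, steps, act_dim)
                    info=np.zeros((2, 4)))

# np.load must receive the path to the .npz file itself; passing its
# parent directory raises IsADirectoryError, as in the traceback above.
demo_data = np.load(demo_file)
print(sorted(demo_data.files))  # ['acs', 'info', 'obs']
```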

the number of CPU cores to use

When I set --num_cpu > 1, I have to wait a long time at this line:

2018-12-26 11:20:50.480133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-26 11:20:50.480139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-26 11:20:50.480292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6899 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:03:00.0, compute capability: 6.1)
Training...

It seems like the program never enters the training phase.
Is there any special setting needed to use more than one CPU?

Learning Time

How long did you train the Barrett WAM arm for the pick-and-place task using this method?
gym-gazebo is quite slow for me to run the learning algorithm on.

ValueError: Object arrays cannot be loaded when allow_pickle=False

Hello, I am trying to train a robot pick and place using HER + DDPG and BC loss.

I have modified her.config.py to use the BC loss and the Q-filter. The number of demonstrations is set to 50, matching the ones I recorded.

I recorded demonstrations using a python script and saved the file with:

fileName = "pick_rail_norandom"
fileName += ".npz"
np.savez_compressed(fileName, acs=actions, obs=observations, info=infos)

I know the file is fine because it is the same one used by another project built on baselines, with no problems at all. I also verified that the agent performed well while recording, by watching the environment and the recorded trajectories.

When I try to start training with MPI, specifically this command:

mpirun -n 4 python3 -m baselines.run --alg=her --env=dVRLPick-v0 --num_timesteps=1e6 --demo_file=/home/neri/Desktop/RL4dVRK/dVRL_simulator/record_demonstration_dVRL/pick_rail_norandom.npz --save_path=~/models/pickrail_norandom --log_path=~/logs/pickrail_norandom

the training exits with:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 255, in <module>
    main(sys.argv)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 221, in main
    model, env = train(args, extra_args)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/run.py", line 85, in train
    **alg_kwargs
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/her.py", line 177, in learn
    policy_save_interval=policy_save_interval, demo_file=demo_file)
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/her.py", line 35, in train
    if policy.bc_loss == 1: policy.init_demo_buffer(demo_file) #initialize demo buffer if training with demonstrations
  File "/home/neri/Desktop/Reach_Rail/baselines/baselines/her/ddpg.py", line 166, in init_demo_buffer
    demo_data_obs = demoData['obs']
  File "/home/neri/.local/lib/python3.6/site-packages/numpy/lib/npyio.py", line 255, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/home/neri/.local/lib/python3.6/site-packages/numpy/lib/format.py", line 727, in read_array
    raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[59978,1],0]
  Exit code:    1
--------------------------------------------------------------------------

Any advice would be highly appreciated.

Also: is there a difference between setting num_env > 1 and using MPI with a given number of processes?

Best regards
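A likely cause, given the error message: NumPy 1.16.3 and later default np.load to allow_pickle=False, and demonstrations stored as object arrays (for example, episodes of unequal length) can only be read back with pickling enabled. A self-contained sketch of this workaround follows; the file name is illustrative, and in the repo the change would go where the demo file is loaded (init_demo_buffer in ddpg.py):

```python
import os
import tempfile

import numpy as np

# Episodes of unequal length force NumPy to store the data as an object
# array, which np.savez_compressed can only serialize via pickle.
observations = np.empty(2, dtype=object)
observations[0] = np.zeros((5, 10))
observations[1] = np.zeros((7, 10))

demo_file = os.path.join(tempfile.mkdtemp(), "pick_rail_norandom.npz")
np.savez_compressed(demo_file, obs=observations)

# With NumPy >= 1.16.3, np.load(demo_file)['obs'] raises the ValueError
# above; passing allow_pickle=True restores the old behaviour.
demo_data = np.load(demo_file, allow_pickle=True)
print(demo_data['obs'][1].shape)  # (7, 10)
```

Only enable allow_pickle for files you trust, since unpickling can execute arbitrary code; alternatively, re-save the demonstrations as regular fixed-shape arrays so no pickling is needed.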

ValueError: not allowed to raise maximum limit

When I run the following command:

python experiment/train.py

I get the following error:
File "experiment/train.py", line 244, in main
launch(**kwargs)
File "experiment/train.py", line 132, in launch
resource.setrlimit(resource.RLIMIT_NOFILE, (65536, 65536))
ValueError: not allowed to raise maximum limit

Is there any solution?
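One plausible workaround: an unprivileged process cannot raise its hard file-descriptor limit, so hard-coding (65536, 65536) fails whenever the existing hard limit is lower. A sketch of a defensive version of that call (assuming the same resource.setrlimit line in experiment/train.py), capping the request at the hard limit the kernel already grants:

```python
import resource

# Query the current soft and hard RLIMIT_NOFILE values.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# An unprivileged process may raise its soft limit only up to the hard
# limit, so never request more than the kernel already allows.
if hard == resource.RLIM_INFINITY:
    target = 65536
else:
    target = min(65536, hard)

resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
print(resource.getrlimit(resource.RLIMIT_NOFILE)[0] == target)  # True
```

Alternatively, raise the hard limit outside the process (e.g. via ulimit or limits.conf) before launching training.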
