dqn_pytorch's People

Contributors

dxyang

dqn_pytorch's Issues

IndexError

Traceback (most recent call last):
  File "main.py", line 120, in <module>
    main()
  File "main.py", line 117, in main
    atari_learn(env, task.env_id, num_timesteps=task.max_timesteps, double_dqn=double_dqn, dueling_dqn=dueling_dqn)
  File "main.py", line 72, in atari_learn
    dueling_dqn=dueling_dqn
  File "/home/sotirisnik/other/DQN_pytorch/learn.py", line 149, in dqn_learning
    obs, reward, done, info = env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 132, in _step
    obs, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 124, in _step
    obs, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 279, in _step
    return self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 93, in _step
    obs, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 279, in _step
    return self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/other/DQN_pytorch/utils/atari_wrappers.py", line 53, in _step
    obs, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/wrappers/monitoring.py", line 33, in _step
    observation, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/wrappers/time_limit.py", line 36, in _step
    observation, reward, done, info = self.env.step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/core.py", line 99, in step
    return self._step(action)
  File "/home/sotirisnik/anaconda3/envs/my_env/lib/python3.6/site-packages/gym/envs/atari/atari_env.py", line 73, in _step
    action = self._action_set[a]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
[2018-06-27 02:58:15,904] Finished writing results. You can upload them to the scoreboard via gym.upload('/home/sotirisnik/other/DQN_pytorch/tmp/BreakoutNoFrameskip-v4')
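
The last frame shows NumPy rejecting the index `a` in `self._action_set[a]`, which means the action that reached `env.step` was not a plain integer. One plausible cause, which is an assumption and not confirmed in this issue, is that the action selected by the network is still a tensor when it is passed to the environment. A minimal sketch of a conversion helper follows; `to_env_action` and `FakeTensor` are hypothetical names introduced only for illustration.

```python
def to_env_action(action):
    """Convert a tensor-like or NumPy-like scalar action to a plain Python int."""
    # Single-element tensors and NumPy scalars expose .item(),
    # which returns an ordinary Python number.
    if hasattr(action, "item"):
        action = action.item()
    return int(action)


class FakeTensor:
    """Hypothetical stand-in for a one-element tensor, so the sketch runs without PyTorch."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


# Intended use inside the training loop (assumption about the call site):
#     obs, reward, done, info = env.step(to_env_action(action))
plain_action = to_env_action(FakeTensor(2))
```

Whether this resolves the error depends on what `action` actually is at `learn.py` line 149, so printing `type(action)` there is a reasonable first check.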

Does it still work? I am not able to test it.

I know Gym has moved to Gymnasium, and some functions have changed. I tried to make the transition myself but could not get it working. Could you update the code so we can test your amazing work? I would be very grateful. Thanks.
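
Two of the changes that matter for a loop like `obs, reward, done, info = env.step(action)` are that Gymnasium's `reset()` returns `(observation, info)` and its `step()` returns five values, splitting `done` into `terminated` and `truncated`. A minimal sketch of an adapter that restores the older call shape is shown here; `OldStepAPI` and `_FakeEnv` are hypothetical names, and the fake environment exists only so the example runs without Gymnasium installed.

```python
class OldStepAPI:
    """Adapter exposing the older 4-tuple step API on top of a Gymnasium-style env."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        # Gymnasium returns (observation, info); older Gym returned only the observation.
        obs, _info = self.env.reset(**kwargs)
        return obs

    def step(self, action):
        # Gymnasium returns (obs, reward, terminated, truncated, info).
        obs, reward, terminated, truncated, info = self.env.step(action)
        done = terminated or truncated
        return obs, reward, done, info


class _FakeEnv:
    """Hypothetical environment that mimics the Gymnasium return shapes."""

    def reset(self, seed=None):
        return 0, {}

    def step(self, action):
        return 1, 1.0, False, True, {}


env = OldStepAPI(_FakeEnv())
first_obs = env.reset()
transition = env.step(0)
```

This only covers the outermost calls. The wrappers in `utils/atari_wrappers.py` override `_step`, which belongs to the older Gym interface, so they would likely need to be ported as well.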

What is the meaning of flipping the Bellman error?

Thanks for your wonderful code. I only use the logic of your training part in my own code, but I found that my model gets worse as training proceeds. When I change

clipped_error = -1.0 * error.clamp(-1, 1)

to

clipped_error = 1.0 * bellman_error

the model works well. I don't understand why the Bellman error needs to be flipped here.
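
A possible explanation, sketched under two assumptions: that `error` is defined as the target minus the current Q-value, and that `clipped_error` is passed directly as the gradient argument, as in the `q_s_a.backward(clipped_error.data.unsqueeze(1))` call that appears in a traceback elsewhere on this page. In that setup the tensor handed to `backward` must be the derivative of the loss with respect to `q_s_a`, and for a squared error that derivative carries a minus sign:

```latex
\begin{aligned}
\delta &= y - Q(s,a), \qquad y = r + \gamma \max_{a'} Q_{\text{target}}(s', a') \quad (\text{treated as a constant}) \\
L &= \tfrac{1}{2}\,\delta^{2} \\
\frac{\partial L}{\partial Q(s,a)} &= \delta \cdot \frac{\partial \delta}{\partial Q(s,a)} = -\delta
\end{aligned}
```

With the clamp to [-1, 1], the same reasoning gives the gradient of the Huber loss, which is the negative of the clipped error. If the training code instead builds a scalar loss from the error and calls `loss.backward()`, autograd already accounts for the sign, so no manual flip is needed. That may be why removing it helped in a different training loop.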

RuntimeError: Index tensor must have same dimensions as input tensor

Hi, if I run the code for Breakout, I get the following error.

Traceback (most recent call last):
  File "main.py", line 120, in <module>
    main()
  File "main.py", line 117, in main
    atari_learn(env, task.env_id, num_timesteps=task.max_timesteps, double_dqn=double_dqn, dueling_dqn=dueling_dqn)
  File "main.py", line 72, in atari_learn
    dueling_dqn=dueling_dqn
  File "/home/ashutosh/repos/DQN_pytorch/learn.py", line 229, in dqn_learning
    q_s_a.backward(clipped_error.data.unsqueeze(1))
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: invalid argument 3: Index tensor must have same dimensions as input tensor at /pytorch/torch/lib/THC/generic/THCTensorScatterGather.cu:199

Dueling DQN equation

Thanks for offering this wonderful code, but I have a question.

  1. Why, in the combination part of the equation, does the advantage A need to have its average subtracted? I have already referred to the paper but still don't understand.
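
One way to see the reason is identifiability. With a plain sum Q(s, a) = V(s) + A(s, a), adding a constant to V and subtracting it from every A leaves Q unchanged, so the network cannot tell which stream should carry which part. Subtracting the mean advantage, Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a')), removes that freedom and makes V equal to the average of Q over actions. The following is a minimal numeric sketch in plain Python, not the repository's implementation; `dueling_q` and `naive_q` are hypothetical helper names.

```python
def naive_q(value, advantages):
    """Plain sum: Q(s, a) = V(s) + A(s, a)."""
    return [value + a for a in advantages]


def dueling_q(value, advantages):
    """Mean-subtracted aggregation: Q(s, a) = V(s) + (A(s, a) - mean(A))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + (a - mean_adv) for a in advantages]


value = 5.0
advantages = [1.0, 2.0, 3.0]

# Plain sum: two different (V, A) pairs give identical Q-values,
# so V and A are not uniquely determined by Q.
naive_a = naive_q(value, advantages)
naive_b = naive_q(value + 10.0, [a - 10.0 for a in advantages])

# Mean-subtracted: a constant shift of the advantages cancels out,
# and the average of Q over actions recovers V.
q = dueling_q(value, advantages)
q_shifted = dueling_q(value, [a + 10.0 for a in advantages])
recovered_value = sum(q) / len(q)
```

Here `naive_a` and `naive_b` are equal even though their value estimates differ by 10, while `q` and `q_shifted` are equal and `recovered_value` matches `value`.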
