Comments (7)
@dhfromkorea hmm, that's interesting. I've run this code on mac and ubuntu without any issue. The fast_cts code is significantly faster; I'll have to double check on the exact speedup, but I would recommend against trying to use the pure python version. 'resize' is coming from the scikit-image library though; I wonder if there could be a problem with that installation?
from tensorflow-rl.
Hm... okay, so it's probably about compilation of the extension...
'resize' is coming from the scikit-image library though
-> it worked fine when using the original cts ... so I doubt it's the cause.
I would recommend against trying to use the pure python version.
-> hm... how do I know if this is the case? I thought I was importing the compiled version of the extension ...
from tensorflow-rl.
Can you let me know the python and library versions that you're using?
Also, when you tried the original cts model, can you describe precisely which changes you made to the code?
from tensorflow-rl.
Hi, I am using python 3.5 and cython 0.25.2. Here you can check the code: https://github.com/RL-ninja/beating-montezuma/blob/master/algorithms/paac_cts.py
FYI, I have not changed the cython code at all. The error happens when I consume the code as it is.
A correction: the line causing the error was not resize(). It appears to have to do with cts_update:
cdef double cts_update(CTSStruct* cts, int[:] context, int symbol):
    cts[0]._time += 1.0
    cts[0].log_alpha = log(1.0 / (cts[0]._time + 1.0))
    cts[0].log_1_minus_alpha = log(cts[0]._time / (cts[0]._time + 1.0))
    cdef double log_prob = node_update(cts[0]._root, context, symbol)
    return log_prob
Along the call trace, the seg fault happens either in node_update or further down.
cdef double node_update(CTSNodeStruct* node, int[:] context, int symbol):
    lp_estimator = estimator_update(node[0].estimator, symbol)
    # If not a leaf node, recurse, creating nodes as needed.
    cdef CTSNodeStruct* child
    cdef double lp_child
    cdef double lp_node
    if context.shape[0] > 0:
        child = node_get_child(node, context[context.shape[0]-1])
        lp_child = node_update(child, context[:context.shape[0]-1], symbol)
        lp_node = node_mix_prediction(node, lp_estimator, lp_child)
        node_update_switching_weights(node, lp_estimator, lp_child)
        return lp_node
    else:
        node[0]._log_stay_prob = 0.0
        return lp_estimator
I am in the process of setting up gdb for cython so I can trace down the error.
For some reason, the seg fault error consistently occurs at a specific "factor" which is [i=1, j=26] where height and width are 42.
I hope to figure out the bug (most likely my bad.. :-)) asap...
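While gdb is being set up, a lighter first step may be Python's built-in faulthandler module, which writes the Python-level traceback when the process receives SIGSEGV. This is only a sketch: the log path and run_training are hypothetical stand-ins, and because cdef functions such as cts_update are not Python frames, the dump would most likely show only the Python-level caller, so gdb is still needed for the C-level frame.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "cts_fault.log")

# The file object must stay open for as long as the handler is enabled.
fault_log = open(log_path, "w")

# On SIGSEGV (and SIGFPE, SIGABRT, SIGBUS, SIGILL) the traceback of every
# Python thread is written to fault_log before the process dies.
faulthandler.enable(file=fault_log, all_threads=True)


def run_training():
    """Hypothetical stand-in for the real entry point (e.g. the main of train.py)."""
    return "done"


result = run_training()
```

The same handler can be switched on without touching the code by starting the interpreter with python -X faulthandler, in which case the traceback goes to stderr.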
from tensorflow-rl.
To clarify what I mean by [i=1, j=26], I do:
cpdef (double, double) _update(self, int[:, :] obs):
    ...
    for i in range(self.height):
        for j in range(self.width):
            ...
            print(i, j)
            log_prob += cts_update(&self.cts_factors[i][j], context, obs[i, j])
            log_recoding_prob += cts_log_prob(&self.cts_factors[i][j], context, obs[i, j])
    return log_prob, log_recoding_prob
And I get consistently:
(1, 24)
(1, 25)
(1, 26)
[1] 3918 segmentation fault python train.py -g montezuma_revenge -df logs/montezuma/ -algo paac_cts
Funnily enough, if I change the game to Breakout, I again consistently get:
(1, 8)
(1, 9)
(1, 10)
[1] 3623 segmentation fault python train.py -g breakout -df logs/breakout -algo paac_cts
from tensorflow-rl.
Hi, I got it fixed. The problem was that the state (the game screen image) arrived as uint8, and I just had to convert it to float64.
Thanks for bearing with my stupid question. Thanks again for your advice and insight!
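For anyone hitting the same thing, the conversion can be sketched with NumPy as below. The helper name prepare_observation and its expected_dtype parameter are hypothetical, and the thread does not show where in the pipeline the conversion was applied, so the dtype each call site actually expects (the _update signature quoted earlier takes int[:, :]) is an assumption to verify against your own code.

```python
import numpy as np


def prepare_observation(frame, expected_dtype=np.float64):
    """Return a 2-D, C-contiguous copy of `frame` with the expected dtype.

    Hypothetical helper: `expected_dtype` should match whatever the
    downstream preprocessing or density model is written for.
    """
    frame = np.asarray(frame)
    if frame.ndim != 2:
        raise ValueError("expected a 2-D frame, got shape %r" % (frame.shape,))
    if frame.dtype != expected_dtype:
        # uint8 screens from the emulator end up here.
        frame = frame.astype(expected_dtype)
    return np.ascontiguousarray(frame)


# A 42x42 uint8 screen, matching the height and width mentioned above.
raw = np.zeros((42, 42), dtype=np.uint8)
obs = prepare_observation(raw)
```

Checking the dtype explicitly at the boundary makes a mismatch visible as a Python-level fact instead of a crash further down in compiled code.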
UPDATE: with fast_cts, I am getting 272.2787954246189 steps/s avg. (with 4-frame skipping)
This still feels very slow, compared to what I normally get without cts: 2000 steps/s.
Do you have any thoughts on how we can make this even faster? (e.g. running on a GPU)
from tensorflow-rl.
@dhfromkorea I could imagine parallelizing the per-pixel CTS model updates, but as there are really no matrix operations going on I'm not sure that running it on a GPU would be a huge help. On my system I found that the Double DQN updates were actually a more significant bottleneck than the density model updates. If you comment out the density model update in my implementation what sort of performance do you get?
Also, sorry for the delay; I've been travelling and haven't had much time on my computer before ICML.
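As an alternative to commenting the update out, the two parts of a training step can be timed separately to see which one dominates. The sketch below uses only the standard library; SectionTimer, density_update and q_update are hypothetical names with dummy workloads, not functions from this repository.

```python
import time
from collections import defaultdict


class SectionTimer:
    """Accumulates wall-clock time per named section of the training loop."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def time(self, name, fn, *args, **kwargs):
        # Run fn and charge its elapsed time to the section `name`.
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        self.totals[name] += time.perf_counter() - start
        self.counts[name] += 1
        return result

    def report(self):
        # Fraction of the measured time spent in each section.
        total = sum(self.totals.values())
        return {name: (t / total if total > 0 else 0.0)
                for name, t in self.totals.items()}


def density_update(obs):
    """Hypothetical stand-in for the per-frame density model update."""
    return sum(obs)


def q_update(batch):
    """Hypothetical stand-in for the Q-network training step."""
    return sum(x * x for x in batch)


timer = SectionTimer()
for step in range(100):
    timer.time("density_model", density_update, list(range(1000)))
    timer.time("q_update", q_update, list(range(100)))

shares = timer.report()
```

In a real run the two stand-ins would be replaced by the actual density model update and network update, and shares would show how the measured time splits between them.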
from tensorflow-rl.
Related Issues (19)
- clang error when running setup.py HOT 2
- a question on the implementation of exploration bonus HOT 3
- about fast_cts.pyx HOT 1
- Explanation on DQN needed HOT 2
- About actor_learner.py
- reproducing your stellar result on Montzuma's Revenge HOT 9
- Tensorboard summaries not appearing HOT 1
- Learning rate calculation of adam wrong? HOT 3
- Training slowing down dramatically HOT 1
- 2to3 python 2 -> 3 upgrade HOT 2
- PseudoCountQLearner
- Cannot install(swig.exe error)
- Code hardwired to Atari and emulator environments HOT 1
- Can't use MountainCarContinuous-v0 with trpo-continuous HOT 2
- Wrong check for 'reward_threshold' property in cem_actor_learner.py HOT 1
- Missing logger in value_based_actor_learner.py
- Can't test CartPole-v0 model trained with TRPO HOT 3
- no 'target_network' for A3CLearner when testing HOT 1