
Comments (4)

dennybritz commented on August 27, 2024

I'm having trouble getting my SGD approximators to converge with state + action as my features. Is there any particular reason why having a separate FA for each action does better?

It depends on how you encode the action into your features. If you have a separate set of features for each action, these two should be almost equivalent. The only reason I created different models for each action is that it was easier to implement with sklearn.
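
A minimal sketch of the two encodings (not taken from the notebooks; the names and sklearn hyperparameters below are just illustrative):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

n_actions, state_dim = 3, 4

# Option A: one linear model per action, each seeing only the state features
# (the approach used in the notebooks because it is easy with sklearn).
per_action_models = [SGDRegressor(learning_rate="constant", eta0=0.01)
                     for _ in range(n_actions)]

# Option B: a single linear model over per-action copies of the state
# features (state features crossed with a one-hot action indicator).
joint_model = SGDRegressor(learning_rate="constant", eta0=0.01)

def joint_features(state, action):
    """Give each action its own block of state features via a one-hot outer product."""
    one_hot = np.zeros(n_actions)
    one_hot[action] = 1.0
    return np.outer(one_hot, state).ravel()

state, action, td_target = np.random.rand(state_dim), 1, 0.5

# Option A updates only the chosen action's model; Option B updates the single
# model on the joint features. With a separate weight block per action, the
# two parameterizations are almost equivalent.
per_action_models[action].partial_fit([state], [td_target])
joint_model.partial_fit([joint_features(state, action)], [td_target])
```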

Additionally, in your experience, which non-incremental learning methods (e.g. ExtraTreesRegressor) work well with such problems?

I'm not really familiar with those, but it seems like you need incremental updates (à la SGD) to make these algorithms work well/fast.
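
For illustration, a rough sketch of the difference, assuming the control loop hands us (features, TD target) pairs; the helper names are hypothetical, and the tree regressor is refit from scratch on all accumulated data, roughly in the spirit of Fitted Q Iteration:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import SGDRegressor

# Incremental estimator: supports partial_fit, so each transition costs one
# cheap gradient step.
sgd_q = SGDRegressor(learning_rate="constant", eta0=0.01)

def incremental_update(features, td_target):
    sgd_q.partial_fit([features], [td_target])

# Batch estimator: ExtraTreesRegressor has no partial_fit, so it must be
# refit on all accumulated pairs, which gets slower as data accumulates.
X_all, y_all = [], []

def batch_update(features, td_target):
    X_all.append(features)
    y_all.append(td_target)
    tree_q = ExtraTreesRegressor(n_estimators=50)
    tree_q.fit(np.array(X_all), np.array(y_all))
    return tree_q
```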

Also, although it's mentioned in the README, the notebooks don't actually use experience replay with SGD.

As far as my understanding goes, experience replay is necessary to make nonlinear function approximation (like neural networks) work, but it's not strictly necessary for linear FA, which seems to work even without it. The DQN implementation does have experience replay, and I don't think it's worth adding it to the linear approximators because they already work well and it would make the code less clear.
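
For reference, a minimal sketch of a uniform replay buffer of the kind DQN uses; the ReplayBuffer class below is illustrative, not the repo's implementation:

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size buffer of past transitions, sampled uniformly at random."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions collected from the environment.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```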


switchfootsid commented on August 27, 2024

Thanks for getting back, Denny.

I always thought experience replay was motivated by the need to break correlations (temporal ones) in the training data collected while interacting with the environment under an epsilon-greedy policy, which leads to faster convergence of the function approximator. By that logic it looks like a more general-purpose trick, useful even for training incremental linear function approximators in control tasks other than Mountain Car. Basically, it's more like SGD vs. mini-batch SGD. What do you think?
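
To make the SGD vs. mini-batch SGD analogy concrete, a small sketch assuming a linear Q-function with one weight vector per action (the function names and hyperparameters are hypothetical):

```python
import numpy as np

# Hypothetical linear Q-function: Q(s, a) = w[a] . s, one weight vector per action.
state_dim, n_actions, alpha, gamma = 4, 2, 0.01, 0.99
w = np.zeros((n_actions, state_dim))

def online_update(s, a, r, s_next, done):
    """Plain online SGD: one semi-gradient Q-learning step per transition,
    applied in the (temporally correlated) order the transitions were observed."""
    target = r + (0.0 if done else gamma * np.max(w @ s_next))
    w[a] += alpha * (target - w[a] @ s) * s

def replay_update(batch):
    """Mini-batch-style step on transitions sampled uniformly from a replay
    buffer, so consecutive updates are decorrelated."""
    grad = np.zeros_like(w)
    for s, a, r, s_next, done in batch:
        target = r + (0.0 if done else gamma * np.max(w @ s_next))
        grad[a] += (target - w[a] @ s) * s
    # In-place update of the shared weights with the averaged gradient.
    w[:] += alpha * grad / len(batch)
```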


dennybritz commented on August 27, 2024

Yes, that's right. I think it's a general-purpose technique and may improve convergence time even for linear approximators; it just doesn't seem to be absolutely necessary to make them converge.


IbrahimSobh commented on August 27, 2024

I agree that using experience replay is helpful anyway. According to https://arxiv.org/abs/1602.01783:

Incorporating experience replay into the asynchronous reinforcement learning framework could improve the data efficiency of these methods by reusing old data. This could in turn lead to much faster training times.

