
Comments (4)

dennybritz commented on August 27, 2024

I'm having trouble getting my SGD approximators to converge with state + action as my features. Is there any particular reason why having a separate FA for each action does better?

It depends on how you encode the action into your features. If you have a separate set of features for each action, these two should be almost equivalent. The only reason I created different models for each action is that it was easier to implement with sklearn.
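
A minimal sketch of the two encodings (not taken from the notebooks; the names and sklearn hyperparameters below are just illustrative):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

n_actions, state_dim = 3, 4

# Option A: one linear model per action, each seeing only the state features
# (the approach used in the notebooks because it is easy with sklearn).
per_action_models = [SGDRegressor(learning_rate="constant", eta0=0.01)
                     for _ in range(n_actions)]

# Option B: a single linear model over per-action copies of the state
# features (state features crossed with a one-hot action indicator).
joint_model = SGDRegressor(learning_rate="constant", eta0=0.01)

def joint_features(state, action):
    """Give each action its own block of state features via a one-hot outer product."""
    one_hot = np.zeros(n_actions)
    one_hot[action] = 1.0
    return np.outer(one_hot, state).ravel()

state, action, td_target = np.random.rand(state_dim), 1, 0.5

# Option A updates only the chosen action's model; Option B updates the single
# model on the joint features. With a separate weight block per action, the
# two parameterizations are almost equivalent.
per_action_models[action].partial_fit([state], [td_target])
joint_model.partial_fit([joint_features(state, action)], [td_target])
```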

Additionally, in your experience, which non-incremental learning methods (e.g. ExtraTreesRegressor) work well with such problems?

I'm not really familiar with those, but it seems like you need incremental updates (à la SGD) to make these algorithms work well/fast.
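
For illustration, a rough sketch of the difference, assuming the control loop hands us (features, TD target) pairs; the helper names are hypothetical, and the tree regressor is refit from scratch on all accumulated data, roughly in the spirit of Fitted Q Iteration:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import SGDRegressor

# Incremental estimator: supports partial_fit, so each transition costs one
# cheap gradient step.
sgd_q = SGDRegressor(learning_rate="constant", eta0=0.01)

def incremental_update(features, td_target):
    sgd_q.partial_fit([features], [td_target])

# Batch estimator: ExtraTreesRegressor has no partial_fit, so it must be
# refit on all accumulated pairs, which gets slower as data accumulates.
X_all, y_all = [], []

def batch_update(features, td_target):
    X_all.append(features)
    y_all.append(td_target)
    tree_q = ExtraTreesRegressor(n_estimators=50)
    tree_q.fit(np.array(X_all), np.array(y_all))
    return tree_q
```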

Also, although it's mentioned in the README, the notebooks don't actually use experience replay with SGD.

As far as my understanding goes, experience replay is necessary to make nonlinear function approximation (like neural networks) work, but it's not strictly necessary for linear FA, which seems to work even without it. The DQN implementation does have experience replay, and I don't think it's worth adding it to the linear approximators because they already work well and it would make the code less clear.
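
For reference, a minimal sketch of a uniform replay buffer of the kind DQN uses; the ReplayBuffer class below is illustrative, not the repo's implementation:

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayBuffer:
    """Fixed-size buffer of past transitions, sampled uniformly at random."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions collected from the environment.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```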


switchfootsid commented on August 27, 2024

Thanks for getting back, Denny.

I always thought experience replay was motivated by the need to break correlations (temporal ones) in the training data collected while interacting with the environment under an epsilon-greedy policy, which leads to faster convergence of the function approximator. By that logic it looks like a more general-purpose trick, useful even for training incremental linear function approximators in control tasks other than Mountain Car. Basically, it's more like SGD vs. mini-batch SGD. What do you think?
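
To make the SGD vs. mini-batch SGD analogy concrete, a small sketch assuming a linear Q-function with one weight vector per action (the function names and hyperparameters are hypothetical):

```python
import numpy as np

# Hypothetical linear Q-function: Q(s, a) = w[a] . s, one weight vector per action.
state_dim, n_actions, alpha, gamma = 4, 2, 0.01, 0.99
w = np.zeros((n_actions, state_dim))

def online_update(s, a, r, s_next, done):
    """Plain online SGD: one semi-gradient Q-learning step per transition,
    applied in the (temporally correlated) order the transitions were observed."""
    target = r + (0.0 if done else gamma * np.max(w @ s_next))
    w[a] += alpha * (target - w[a] @ s) * s

def replay_update(batch):
    """Mini-batch-style step on transitions sampled uniformly from a replay
    buffer, so consecutive updates are decorrelated."""
    grad = np.zeros_like(w)
    for s, a, r, s_next, done in batch:
        target = r + (0.0 if done else gamma * np.max(w @ s_next))
        grad[a] += (target - w[a] @ s) * s
    # In-place update of the shared weights with the averaged gradient.
    w[:] += alpha * grad / len(batch)
```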


dennybritz commented on August 27, 2024

Yes, that's right. I think it's a general-purpose technique and may improve convergence time even for linear approximators; it just doesn't seem to be absolutely necessary to make them converge.


IbrahimSobh commented on August 27, 2024

I agree that using experience replay is helpful anyway. According to https://arxiv.org/abs/1602.01783:

Incorporating experience replay into the asynchronous reinforcement learning framework could improve the data efficiency of these methods by reusing old data. This could in turn lead to much faster training times.

