Comments (14)

dgriff777 commented on May 18, 2024

In contrast, though, in Breakout-v0 it's scoring over 400 in 4-5 hrs, which is far faster than the other model on 32 threads.

from rl_a3c_pytorch.

dgriff777 commented on May 18, 2024

Well, to start, we have different input sizes: his is 42x42 and mine is 80x80. His model is an exact replica of the universe starter agent. That model is good but obviously very fine-tuned for Pong specifically. I'm using a 4-layer Conv2d model with 32 filters of size 5 × 5, 32 filters of size 5 × 5, 64 filters of size 4 × 4, and 32 filters of size 3 × 3, with single strides for all and max pooling on each. I'm also using a 512 LSTM cell as opposed to a 256 LSTM cell. I also have a shared RMSprop optimizer implemented. My model is obviously larger, so it's slower to train, but it's more robust and has much higher final performance, as it's designed for the tough gym v0 environments.
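For reference, the layer sizes described above can be traced with a little shape arithmetic. This is only a sketch under assumptions that the comment does not state: zero padding on every convolution and 2x2 max pooling with stride 2. The actual repository may use different padding or pooling, so the numbers below illustrate the method rather than the real model. The helper names (`conv_out`, `pool_out`, `feature_map_sizes`) are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a square convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max pooling (assumed 2x2, stride 2)."""
    return (size - kernel) // stride + 1


def feature_map_sizes(input_size, conv_layers):
    """Return (filters, height, width) after each conv + pool stage."""
    sizes = []
    size = input_size
    for filters, kernel in conv_layers:
        size = pool_out(conv_out(size, kernel))  # stride-1 conv, then pool
        sizes.append((filters, size, size))
    return sizes


# (filters, kernel size) per layer, as listed in the comment above.
LAYERS = [(32, 5), (32, 5), (64, 4), (32, 3)]

sizes = feature_map_sizes(80, LAYERS)  # 80x80 input
filters, height, width = sizes[-1]
flat_features = filters * height * width  # size of the vector fed onward

for stage, shape in enumerate(sizes, start=1):
    print(f"after stage {stage}: {shape}")
print("flattened features:", flat_features)
```

Changing the padding argument in `conv_out` changes the flattened size considerably, which is why the padding assumption matters when sizing the layer that feeds the LSTM cell.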

slowbull commented on May 18, 2024

Thanks! In your experiments, does the shared RMSprop optimizer work better than Adam?

dgriff777 commented on May 18, 2024

They are actually quite different, considering both are obviously A3C LSTM.

dgriff777 commented on May 18, 2024

I fine-tuned Adam more, so I've been using that to train, but with some tinkering RMSprop should give similar results, judging from the few times I played with it. Changing the Adam epsilon default was a must. Big improvement from just that.

slowbull commented on May 18, 2024

Thanks for your quick reply!

dgriff777 commented on May 18, 2024

They both show the benefit of being more robust and a steadying factor on learning compared to non-shared optimizers.

ethancaballero commented on May 18, 2024

@dgriff777 @ppwwyyxx Why did increasing the Adam epsilon from 1e-8 to 1e-3 help? The purpose of epsilon is to prevent division by zero by adding it to the denominator. 1e-8 is already large enough to prevent division by zero (I think), so changing it to 1e-3 would just add more arbitrary bias.

dgriff777 commented on May 18, 2024

The default epsilon for Adam is often not the best choice in my experience. As to why it works better in this case, several things could be the cause, but it's hyperparameter searching, which always has a fuzzy factor.
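One way to look at the epsilon question is to compute a single Adam update by hand. The sketch below is an illustration, not an explanation confirmed by anyone in this thread. It uses the standard Adam update rule with bias correction, under which the first step for a scalar gradient g reduces to lr * g / (|g| + eps). With eps = 1e-8 the step magnitude stays close to lr no matter how small the gradient is. With eps = 1e-3 the step shrinks once the gradient magnitude falls near or below 1e-3, so epsilon acts as a damping term as well as a guard against division by zero. The function name `adam_first_step` and the default `lr=1.0` (chosen so the result reads as a fraction of the learning rate) are hypothetical.

```python
import math


def adam_first_step(grad, lr=1.0, eps=1e-8, beta1=0.9, beta2=0.999):
    """Size of the first bias-corrected Adam update for a scalar gradient."""
    m = (1 - beta1) * grad           # first-moment estimate after one step
    v = (1 - beta2) * grad * grad    # second-moment estimate after one step
    m_hat = m / (1 - beta1)          # bias correction at step t = 1
    v_hat = v / (1 - beta2)
    return lr * m_hat / (math.sqrt(v_hat) + eps)


# Compare the two epsilon values across gradient magnitudes.
for g in (1.0, 1e-2, 1e-4):
    small_eps = adam_first_step(g, eps=1e-8)
    large_eps = adam_first_step(g, eps=1e-3)
    print(f"grad={g:g}  step/lr (eps=1e-8)={small_eps:.4f}  "
          f"step/lr (eps=1e-3)={large_eps:.4f}")
```

For a gradient of 1e-4, the larger epsilon cuts the step to roughly one eleventh of the learning rate, while large gradients are almost unaffected. Whether this damping is what helped in this particular A3C setup is not established here.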

slowbull commented on May 18, 2024

How long does it take to train Pong-v0? I used 16 threads, and after 7 hours the episode reward is about 10, far slower and worse than the original network.

dgriff777 commented on May 18, 2024

Well, as I said before, the universe starter agent/ikostrikov/pytorch-a3c model is highly optimized for Pong. That model also uses 42x42 input while mine uses 80x80, which means more data to crunch, and mine is also a larger, more robust model so that it can perform well on all Atari games, not just Pong, which is also quite simple. For Pong-v0 it's going to take about 6-7 hrs to start scoring 21 pts, as opposed to around 2 hrs for the other model, I believe, but my model has a better overall performance limit.

dgriff777 commented on May 18, 2024

That's for 32 threads. I have not trained it on 16 threads, but a rough estimate would be around 10 hrs at most for 16 threads, I believe.

slowbull commented on May 18, 2024

After about 8 hours, I got expected results on Pong-v0. Thanks!

dgriff777 commented on May 18, 2024

You're welcome :)
