
onetimepad / advantage


A framework for making RL easy!

Languages: Python 99.02%, Shell 0.98%
Topics: reinforcement-learning, deep-reinforcement-learning, deep-learning, ai, python3, tensorflow, machine-learning, openai-gym, framework, advantage


advantage's Issues

DQN train_iteration fix

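The train_iteration method of DeepQModel has two separate if statements, one for improve_policy_modulo and one for improve_target_modulo. When both conditions hold on the same iteration (for example, when one modulo is a multiple of the other), the target improvement runs right after the policy improvement. Perhaps the target improvement should be skipped in that case?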
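The collision described above can be sketched as follows. This is a minimal illustration, assuming improve_policy_modulo and improve_target_modulo are integer step intervals; the helper should_improve and its skip_target_after_policy flag are hypothetical and are not taken from the repository.

```python
def should_improve(step, improve_policy_modulo, improve_target_modulo,
                   skip_target_after_policy=True):
    """Return (improve_policy, improve_target) for a training step.

    Hypothetical helper: the parameter names mirror the issue text, but the
    function itself is an illustration, not code from the repository.
    """
    improve_policy = step % improve_policy_modulo == 0
    improve_target = step % improve_target_modulo == 0

    # When both intervals fire on the same step (for example, one interval
    # is a multiple of the other), optionally suppress the target update.
    if improve_policy and improve_target and skip_target_after_policy:
        improve_target = False

    return improve_policy, improve_target


# Steps on which both updates would coincide for intervals 4 and 8.
collisions = [s for s in range(1, 25) if s % 4 == 0 and s % 8 == 0]
```

One caveat with this particular rule: if improve_target_modulo is a multiple of improve_policy_modulo, every target step collides with a policy step, so the target would never be updated. Choosing an explicit order (target first, or target deferred by one step) may be the safer fix.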

OOP paradigm: separate uses of approximators_builder and base_approximators

The "config" parameter in base_approximators couples approximators_builder and base_approximators, even though the config should be handled only by approximators_builder. agents_builder and base_agents already follow the proper pattern. Each approximator may need its own "builder", as the agents have. In short, the config parameter shouldn't be passed around.
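A minimal sketch of the separation being asked for, assuming the config carries a couple of plain fields. ApproximatorConfig, ApproximatorBuilder, BaseApproximator, and the fields num_units and learning_rate are all hypothetical names chosen for illustration; the repository's actual classes and config format may differ.

```python
from dataclasses import dataclass


@dataclass
class ApproximatorConfig:
    """Hypothetical stand-in for the parsed approximator config."""
    num_units: int
    learning_rate: float


class BaseApproximator:
    """Receives plain arguments and never sees the config object."""

    def __init__(self, num_units, learning_rate):
        self.num_units = num_units
        self.learning_rate = learning_rate


class ApproximatorBuilder:
    """The only place that reads the config."""

    def __init__(self, config):
        self._config = config

    def build(self):
        # Unpack the config here so it is not passed any further.
        return BaseApproximator(
            num_units=self._config.num_units,
            learning_rate=self._config.learning_rate,
        )


approximator = ApproximatorBuilder(ApproximatorConfig(64, 1e-3)).build()
```

With this split, the approximator can be constructed and tested without any config object at all.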

Approximators inference

An approximator can have multiple feed_dict elements but only one concatenated input tensor. There should be a way to keep track of all of them, which means changing the way inference() works.
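One possible way to keep track of several inputs is a small registry keyed by name. The sketch below is an assumption, not the framework's design: InputRegistry is a hypothetical class, and plain strings stand in for the TensorFlow placeholders so the example runs without TensorFlow.

```python
class InputRegistry:
    """Tracks named inputs so inference can build a complete feed_dict.

    Hypothetical sketch: placeholders are plain objects here; in the real
    framework they would be TensorFlow placeholders.
    """

    def __init__(self):
        self._placeholders = {}

    def register(self, name, placeholder):
        if name in self._placeholders:
            raise ValueError(f"input {name!r} is already registered")
        self._placeholders[name] = placeholder

    def feed_dict(self, **values):
        # Require exactly the registered inputs, no more and no fewer.
        missing = sorted(set(self._placeholders) - set(values))
        unknown = sorted(set(values) - set(self._placeholders))
        if missing or unknown:
            raise KeyError(f"missing inputs: {missing}, unknown inputs: {unknown}")
        return {self._placeholders[name]: value for name, value in values.items()}


registry = InputRegistry()
registry.register("state", "state_placeholder")
registry.register("action", "action_placeholder")
feed = registry.feed_dict(state=[0.1, 0.2], action=[1])
```

An inference() built on such a registry could accept keyword arguments and fail loudly when an input is forgotten, instead of relying on a single concatenated tensor.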

Wrapper in DiscreteActionSpaceAgent awkward

There is a wrapper (_action_wrapper) in DiscreteActionSpaceAgent. It extracts the single element from the one-element np.array returned by DQNAgent or by any other agent that is both Discrete and ActionValue. Its placement in DiscreteActionSpaceAgent is a bit awkward, though: technically an ActionValue agent could be continuous, and a Discrete agent doesn't necessarily return an action that requires such a wrapper.

There should be a better fix for this... maybe a DiscreteActionValueAgent?

See base_agents
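A rough sketch of what a DiscreteActionValueAgent could look like, with the unwrapping step living in the one class that actually needs it. Everything here is hypothetical: the method names raw_action and action, the ActionValueAgent base, and the toy FixedAgent are illustrations, and a one-element list stands in for the one-element np.array to keep the example dependency-free.

```python
class ActionValueAgent:
    """Hypothetical base: subclasses return actions as one-element sequences."""

    def raw_action(self, state):
        raise NotImplementedError


class DiscreteActionValueAgent(ActionValueAgent):
    """Hypothetical class that owns the unwrapping step."""

    def action(self, state):
        raw = self.raw_action(state)
        if len(raw) != 1:
            raise ValueError(f"expected one action, got {len(raw)}")
        # A one-element list stands in for the one-element np.array.
        return raw[0]


class FixedAgent(DiscreteActionValueAgent):
    """Toy agent that always picks action 2."""

    def raw_action(self, state):
        return [2]


chosen = FixedAgent().action(state=None)
```

Under this arrangement, a continuous ActionValue agent or a Discrete agent that already returns a scalar would simply not inherit the wrapper.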

Configuration for Approximator Inputs

This would allow more fine-grained control over which inputs go into an Approximator. It would also allow multiple approximators to be connected together, reducing the need to write specific classes for combined approximators.

See approximators.proto and inputs_placeholder in set_up()

Specific Exceptions

Look for TODOs about making exceptions more specific.

For example: "BadActionException" or something similar.
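As an illustration of the kind of specific exception suggested above, here is a minimal sketch. The AdvantageError base class, the validate_action helper, and the assumption that actions are integers in the range [0, num_actions) are all hypothetical and not taken from the repository.

```python
class AdvantageError(Exception):
    """Hypothetical base class for framework errors."""


class BadActionException(AdvantageError):
    """Raised when an agent produces an action outside the action space."""

    def __init__(self, action, num_actions):
        super().__init__(
            f"action {action!r} is not in the range [0, {num_actions})"
        )
        self.action = action
        self.num_actions = num_actions


def validate_action(action, num_actions):
    """Return the action unchanged, or raise if it is not a valid index."""
    if not isinstance(action, int) or not 0 <= action < num_actions:
        raise BadActionException(action, num_actions)
    return action
```

A shared base class lets callers catch every framework error at once while still handling a bad action separately when they need to.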
