
Comments (5)

dementrock commented on June 24, 2024

Hi @singulaire, sorry about this bug. We have indeed only tested DDPG with flattened observations. Flattening the observations and actions before adding them to the pool should fix it. For example, this line can be replaced with:

pool.add_sample(
    self.env.observation_space.flatten(observation),
    self.env.action_space.flatten(action),
    reward * self.scale_reward,
    terminal
)

Could you see if this works? If so, I'm happy to accept a pull request or fix it on my end.
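
To illustrate what the flatten calls above do to an image-shaped observation, here is a minimal sketch. `BoxSketch` is a hypothetical stand-in written for this example, not rllab's own space class, and the (3, 4, 4) observation shape is an assumption.

```python
import numpy as np


class BoxSketch:
    """Hypothetical stand-in for a Box-style space (not rllab's class)."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    @property
    def flat_dim(self):
        # Number of scalars in one sample of this space.
        return int(np.prod(self.shape))

    def flatten(self, x):
        # Collapse any array-shaped sample into a 1-D vector.
        return np.asarray(x).reshape(-1)

    def unflatten(self, x):
        # Restore the original shape from a flat vector.
        return np.asarray(x).reshape(self.shape)


# Assumed image-like observation: 3 channels, 4x4 pixels.
obs_space = BoxSketch((3, 4, 4))
observation = np.arange(48, dtype=np.float32).reshape(3, 4, 4)

# The pool would then store a length-48 vector instead of a 3-D array.
flat_obs = obs_space.flatten(observation)
```

The flat vector carries the same values, so it can be reshaped back when needed, but any code that consumes it directly no longer sees the spatial layout.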

from rllab.

singulaire commented on June 24, 2024

I was able to get the example to work by using the fix plus a few other flatten operations, and I will open a pull request shortly. That said, while this lets the code run without breaking, flattening means we aren't taking advantage of the structure in image data.
I think a longer-term solution would be to change the various network-based classes so that they can be given a custom network as an argument, as was done in bc1b506. The changes that were needed for GaussianMLPPolicy are fairly small. With this capability available, it would be easy to build CNNs with the ConvNetwork class, which should be well suited to image data.

What do you think? I will gladly look into it if you believe this is a good addition to the project.
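
As a rough sketch of the "custom network as an argument" idea, the pattern could look like the following. All class names here (`TinyMLP`, `MeanPoolNet`, `PolicySketch`) are hypothetical and written only for illustration; they are not rllab classes, and plain NumPy stands in for the actual Theano-based networks.

```python
import numpy as np


class TinyMLP:
    """Hypothetical default network: one dense layer over flattened input."""

    def __init__(self, input_dim, output_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(input_dim, output_dim))
        self.b = np.zeros(output_dim)

    def forward(self, obs):
        # Flatten internally, so the caller can pass any shape.
        return np.tanh(np.asarray(obs).reshape(-1) @ self.W + self.b)


class MeanPoolNet:
    """Hypothetical custom network that reads a (C, H, W) image directly."""

    def __init__(self, action_dim):
        self.action_dim = action_dim

    def forward(self, obs):
        # Average over the spatial axes, keeping one value per channel.
        channel_means = np.asarray(obs).mean(axis=(1, 2))
        # Repeat or truncate to the requested action dimension.
        return np.resize(channel_means, self.action_dim)


class PolicySketch:
    """Uses the supplied network, or builds the default MLP if none is given."""

    def __init__(self, obs_shape, action_dim, network=None):
        if network is None:
            network = TinyMLP(int(np.prod(obs_shape)), action_dim)
        self.network = network

    def get_action(self, observation):
        return self.network.forward(observation)


image = np.ones((3, 4, 4))
default_policy = PolicySketch((3, 4, 4), action_dim=2)
custom_policy = PolicySketch((3, 4, 4), action_dim=2, network=MeanPoolNet(2))
```

With this shape of constructor, existing callers keep the default behaviour, while a caller with image data can pass in a network that preserves the spatial structure.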


dementrock commented on June 24, 2024

Thanks, and glad it worked! I agree that the network-based classes should be more flexible. What further changes are required for the GaussianMLPPolicy? Or did you mean the DeterministicMLPPolicy class? Feel free to also include the necessary changes in the pull request!


singulaire commented on June 24, 2024

I meant the DeterministicMLPPolicy, but the ContinuousMLPQFunction class could also benefit from the same treatment. It may or may not be desirable to also allow a mix of CNNs that accept image input and fully connected networks that take flattened input, but that would require more complex logic (e.g. a "FLATTEN" flag for each object in charge of flattening). Finally, I ran into some additional problems with non-flattened input that I couldn't solve easily, probably because I'm not familiar enough with Theano.

All in all, I included just the flattening logic in #23, so that DDPG works with image observations, although it doesn't take advantage of structure in image data.
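
For reference, the per-object "FLATTEN" flag mentioned above might be sketched as follows. `ConsumerSketch` and its `flatten_input` parameter are hypothetical names for this example only; nothing like this is included in #23.

```python
import numpy as np


class ConsumerSketch:
    """Hypothetical wrapper for any object that consumes observations."""

    def __init__(self, flatten_input):
        # True for fully connected consumers, False for convolutional ones.
        self.flatten_input = flatten_input

    def prepare(self, observation):
        obs = np.asarray(observation)
        # Each consumer decides for itself whether it wants a flat vector.
        return obs.reshape(-1) if self.flatten_input else obs


raw_obs = np.zeros((3, 4, 4))            # assumed image-shaped observation
mlp_side = ConsumerSketch(flatten_input=True)
cnn_side = ConsumerSketch(flatten_input=False)

mlp_input = mlp_side.prepare(raw_obs)    # 1-D vector of length 48
cnn_input = cnn_side.prepare(raw_obs)    # unchanged (3, 4, 4) array
```

Under this design the pool would store observations in their raw shape, and flattening would move from the sampling code into whichever object needs it.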


dementrock commented on June 24, 2024

Closing this since #23 is merged. Thanks!

