
Comments (4)

Bowen-He commented on August 21, 2024

@ku2482
Hi! I used the entropy loss, and the code runs well. I still have a question about the returns: could you tell me why the train return is so far from the test return? I checked train_episode() and evaluate() in base_agent.py, but I couldn't find the explanation there.
(Screenshot attached)

from fqf-iqn-qrdqn.pytorch.

toshikwa commented on August 21, 2024

@Bowen-He

In a personal communication, the author mentioned the following, and your problem seems to be related to it.

We did not use the regularization term in our implementation. The reason we mention this term is that in some rare cases (about 1 out of 20 seeds), the proposed fractions may degenerate into a deterministic one (e.g. [0,0,0,...,1,0,...]).

In the FQF paper, the authors proposed using an entropy bonus (regularization term) to prevent this.
I implemented it, so you can enable the entropy bonus by setting ent_coef to a value greater than 0 in the config.
(I haven't tested it due to limited resources.)
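
To illustrate why an entropy bonus discourages a degenerate fraction proposal, here is a minimal framework-free sketch. It is not the repository's code: the function names and the example distributions are hypothetical, and only ent_coef comes from the discussion above. It assumes the fraction proposal is a probability vector and that the bonus is subtracted from the fraction loss.

```python
import math


def fraction_entropy(probs):
    """Shannon entropy of a proposed fraction distribution (probabilities summing to 1)."""
    # Terms with p == 0 contribute nothing, so they are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_fraction_loss(fraction_loss, probs, ent_coef):
    """Hypothetical helper: subtract the entropy bonus, so higher entropy lowers the loss."""
    return fraction_loss - ent_coef * fraction_entropy(probs)


# Made-up examples: a spread-out proposal and a degenerate one like [0,0,...,1,0,...].
uniform = [0.25, 0.25, 0.25, 0.25]
degenerate = [0.0, 0.0, 1.0, 0.0]

loss_uniform = regularized_fraction_loss(1.0, uniform, ent_coef=0.01)
loss_degenerate = regularized_fraction_loss(1.0, degenerate, ent_coef=0.01)
```

With ent_coef set to 0 the two losses are identical, which matches the default of not using the regularizer. With ent_coef greater than 0, the degenerate proposal gets no bonus and therefore ends up with the higher loss.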

Thanks.


Bowen-He commented on August 21, 2024

@ku2482
Thank you! I'll run the experiments right away and see whether anything changes. I've tried seed 0 and seed 5, but both eventually end up in the same situation. I hope a regularizer helps!


toshikwa commented on August 21, 2024

@Bowen-He
Thank you for sharing the result ;)

We clip the reward during training, so the train return is computed from clipped rewards.
https://github.com/ku2482/fqf-iqn-qrdqn.pytorch/blob/master/fqf_iqn_qrdqn/env.py#L166
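
As a rough illustration of how reward clipping alone can separate the two curves, here is a small sketch. It assumes sign-based clipping (each reward mapped to -1, 0, or +1), which is common in Atari wrappers, and that evaluation sums unclipped rewards. The function name and the reward values are made up for the example and are not taken from the repository.

```python
def clip_reward(reward):
    """Assumed clipping rule: map a reward to -1, 0, or +1 according to its sign."""
    return (reward > 0) - (reward < 0)


# Hypothetical raw rewards from a single episode.
raw_rewards = [0, 10, 0, 200, -5, 50]

# Return as it would be logged if rewards are clipped before being summed.
train_return = sum(clip_reward(r) for r in raw_rewards)

# Return as it would be logged if raw rewards are summed.
test_return = sum(raw_rewards)
```

The same episode yields very different numbers depending on whether rewards are clipped before summing, so a large gap between train and test returns does not by itself indicate a bug.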

