Comments (8)
Keep in mind that epsilon begins at 1 and decays to 99.5% of its value every 32 time steps, as given by batch_size. The policy given by the act method will be random most of the time, and it can take several episodes before epsilon gets below 50%.
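To put a rough number on that, here is a minimal sketch that counts how many multiplicative decay updates are needed before epsilon drops below a threshold. The function name `updates_until_below` and the variable `steps_per_update` are illustrative and not taken from the repo; the value 32 is simply the update interval described in the comment above.

```python
def updates_until_below(threshold, epsilon=1.0, decay=0.995):
    """Count multiplicative decay updates until epsilon < threshold."""
    updates = 0
    while epsilon >= threshold:
        epsilon *= decay  # one decay update
        updates += 1
    return updates


# Assumption from the comment above: one decay update every 32 time steps.
steps_per_update = 32

updates = updates_until_below(0.5)
print("decay updates needed:", updates)
print("time steps needed:", updates * steps_per_update)
```

With a decay factor of 0.995 this comes to 139 updates. How many episodes that spans depends on how often the decay is actually applied and on episode length.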
from deep-q-learning.
With a batch size of 128 and the same epsilon decay rate, this repo shows impressive results when the model is reloaded. I ran the model several times and it showed good results every time. The model was trained for fewer than 1000 episodes.
It looks like the difference is that keon's code saves and loads the weights, whereas your altered code saves and loads the model. I don't have much experience with Keras, but I would assume the value of epsilon is reset to 1 when loading the weights and initializing the model.
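One way to rule out the epsilon reset is to store the exploration rate next to the network file and restore it on reload. The sketch below assumes epsilon is a plain attribute on the agent object, in which case neither a weights file nor a saved model would contain it. `AgentState`, `save_agent_state`, and `load_agent_state` are hypothetical names for illustration, not part of the repo or of Keras.

```python
import json
import os
import tempfile


class AgentState:
    """Hypothetical stand-in for the non-network part of a DQN agent."""

    def __init__(self, epsilon=1.0, epsilon_min=0.01, epsilon_decay=0.995):
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay

    def decay(self):
        # Multiplicative decay, stopped once epsilon reaches the floor.
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay


def save_agent_state(agent, path):
    """Write the exploration rate to a JSON sidecar file."""
    with open(path, "w") as f:
        json.dump({"epsilon": agent.epsilon}, f)


def load_agent_state(agent, path):
    """Restore the exploration rate from the JSON sidecar file."""
    with open(path) as f:
        agent.epsilon = json.load(f)["epsilon"]


trained = AgentState()
for _ in range(200):
    trained.decay()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "agent_state.json")
    save_agent_state(trained, path)

    restored = AgentState()  # a fresh agent starts at epsilon = 1.0
    load_agent_state(restored, path)

print("restored epsilon matches:", restored.epsilon == trained.epsilon)
```

If the reloaded agent still behaves differently with epsilon restored, the cause is more likely elsewhere, for example in hyperparameters or in what the saved file contains.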
Yes, I was having a hard time with saving the weights, so I switched to saving the model instead. With this the results were much better, and I recommend the same. I cannot really comment on your assumption, though.
Adibyte95, saving the model is similar to saving the weights (because the number of nodes is constant). Is there anything more? Why is your code more accurate than Keon's code?
I am not sure. Maybe the initial weights are loaded rather than the partially trained weights.
@keon Is it similar to DDQN? (By DDQN, do you mean double or dueling DQN?)
DDQN means double DQN here, since there is no dueling implementation in the repo.
The hyperparameters greatly affect the performance of a model.
When comparing implementations, please make sure to fix the random seeds and hyperparameters.
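As a small illustration of why fixed seeds matter for such comparisons, the sketch below runs an epsilon-greedy exploration schedule with a seeded random generator and shows that the same seed reproduces the same sequence of choices. `run_exploration` is a hypothetical helper, and `-1` is only a placeholder for "take the greedy action". In a real comparison, NumPy, the environment, and the deep learning backend each have their own random state that would likely need to be seeded as well.

```python
import random


def run_exploration(seed, steps=10, epsilon=0.5, n_actions=2):
    """Return the explore/exploit choices made under a given seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    choices = []
    for _ in range(steps):
        if rng.random() <= epsilon:
            choices.append(rng.randrange(n_actions))  # random exploratory action
        else:
            choices.append(-1)  # placeholder for the greedy action
    return choices


first = run_exploration(seed=42)
second = run_exploration(seed=42)
print("same seed, same choices:", first == second)
```

Holding the seed and the hyperparameters fixed makes it possible to attribute a difference in results to the code change itself, such as saving the model instead of the weights.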