Comments (11)
Great! 👍 Looking forward to this. I am working in a very different environment with a different model as well.
from atari.
Any adverse feelings about using require(opt.env)? It doesn't feel right providing an executable path as input.
if not opt.env or opt.env == '' then
  if opt.game == 'catch' then
    opt.env = 'rlenvs.Catch'
    opt.envScale = true
  else
    opt.env = 'rlenvs.Atari'
    opt.envScale = false
  end
end
local Environment = require(opt.env)
self.env = Environment(opt)
local stateSpec = self.env:getStateSpec()
-- Provide original channels, height and width for resizing from
opt.origChannels, opt.origHeight, opt.origWidth = table.unpack(stateSpec[2])
if opt.envScale then
  -- Adjust height and width
  opt.height, opt.width = stateSpec[2][2], stateSpec[2][3]
end
from atari.
Prefixing the input with 'rlenvs.' at least means the included code is for the intended purpose.
local Environment = require('rlenvs.' .. opt.env)
from atari.
Yeah, this looks like an improvement. The code has changed a little in the refactor, but that's not an issue. I would go with the whole opt.env rather than prefixing with 'rlenvs.', as people can then use their own arbitrary classes.
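A minimal sketch of how passing the whole module name could be made a little safer, assuming plain Lua and nothing from this repo. The helper names resolveEnvName and loadEnvironment, and the stand-in module 'myenvs.Dummy', are hypothetical; the defaults mirror the snippet above. Wrapping require in pcall means a mistyped module name produces a readable error rather than a raw stack trace.

```lua
-- Hypothetical helper: pick the module name, falling back on opt.game
local function resolveEnvName(opt)
  if not opt.env or opt.env == '' then
    if opt.game == 'catch' then
      return 'rlenvs.Catch'
    end
    return 'rlenvs.Atari'
  end
  return opt.env
end

-- Hypothetical helper: require the module and report failures clearly
local function loadEnvironment(opt)
  local name = resolveEnvName(opt)
  local ok, Environment = pcall(require, name)
  if not ok then
    error('Could not load environment "' .. name .. '": ' .. tostring(Environment))
  end
  return Environment, name
end

-- Stand-in for a user-supplied class, registered so that require can find it
package.preload['myenvs.Dummy'] = function()
  return { name = 'Dummy' }
end

local Env, envName = loadEnvironment({env = 'myenvs.Dummy'})
print(envName, Env.name)
```

This does not restrict what can be loaded; it only keeps the defaults in one place and makes a bad path fail early.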
from atari.
Could a similar thing then be done for models?
Specifically the convolution layers before the complexity of duel DQN etc. Perhaps a model or modelConv param could be specified. It could call nn.Sequential(), add layers and return net after net:add(nn.View(convOutputSize)). Then, like above, if it is omitted, opt.game could require a default model depending on the paper.
If instead passing around an existing net reference, perhaps each part of building the network could be configurable separately... Though modelConv + modelDuel is inflexible compared to something where extra model components could be arbitrarily tacked on in an array: ['8x8_4x4_3x3conv', 'duel', 'bootstraps'], ['5x5_5x5conv', 'hidden', 'concat'].
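To make the array idea concrete, here is a rough sketch in plain Lua. Everything in it is hypothetical: the components registry, the buildModel function and the layer names are illustrative only, and strings stand in for real nn modules so the example has no dependencies.

```lua
-- Hypothetical registry mapping component names to builder functions.
-- Each builder appends to the list of layers built so far; the strings
-- are placeholders for real nn modules.
local components = {
  ['5x5_5x5conv'] = function(layers)
    table.insert(layers, 'conv5x5')
    table.insert(layers, 'conv5x5')
  end,
  hidden = function(layers)
    table.insert(layers, 'linear')
  end,
  duel = function(layers)
    table.insert(layers, 'duelHead')
  end,
}

-- Build a model description by applying each named component in order
local function buildModel(spec)
  local layers = {}
  for _, name in ipairs(spec) do
    local builder = components[name]
    if not builder then
      error('Unknown model component: ' .. tostring(name))
    end
    builder(layers)
  end
  return layers
end

local model = buildModel({'5x5_5x5conv', 'hidden', 'duel'})
print(table.concat(model, ' -> '))
```

The order of the array is the order of the layers, so tacking on an extra component is a one-word change to the spec.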
from atari.
Some approaches include using a proper DSL or simply loading a model as you suggest. I'm leaning towards loading a model as a "modelBody", as using various DQN "heads" is part of this repo. This also allows pretrained weights to be loaded.
For example, a suitable network for the Mountain Car problem would be a Linear layer mapping from 2 inputs to 32 hidden neurons, and hiddenSize can be set to 32.
from atari.
When it comes to the model end of things, returning a network created from nn.Sequential() onward, along with the results of nn.ClassNLLCriterion(), would match that DSL signature.
However, there is this line in the existing implementation where a copy is made if recurrent is set:
Lines 65 to 67 in e3d6470
Seems you'll want to retain the ability to make changes to net before passing a reference and adding modelBody.
from atari.
Again on the model end of things, it seems Model:preprocess will need to be extendable through whatever code is provided on a per-environment basis.
Perhaps a base Model class is inherited by a required custom per-environment class; preprocess could have customisations added, while also exposing a method to generate the model "body".
Although this could mean an external class having to load one from this project in order to inherit. It might make more sense the other way around, with a parent class here loading an external one for customised methods.
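A rough sketch of that second arrangement, in plain Lua with metatables rather than any class library. The option name customModel, the Model.new constructor and the stand-in module 'mymodels.Scaled' are all hypothetical and not taken from this project.

```lua
-- Hypothetical parent class that loads an optional external module
-- and defers to it for customised methods
local Model = {}
Model.__index = Model

function Model.new(opt)
  local self = setmetatable({}, Model)
  self.custom = {}
  if opt.customModel and opt.customModel ~= '' then
    self.custom = require(opt.customModel)
  end
  return self
end

-- Use the external preprocess if one is provided, else pass through
function Model:preprocess(observation)
  if self.custom.preprocess then
    return self.custom.preprocess(observation)
  end
  return observation
end

-- Stand-in for a user-supplied module that rescales pixel values
package.preload['mymodels.Scaled'] = function()
  return {
    preprocess = function(observation)
      local out = {}
      for i, v in ipairs(observation) do
        out[i] = v / 255
      end
      return out
    end,
  }
end

local default = Model.new({})
local scaled = Model.new({customModel = 'mymodels.Scaled'})
local result = scaled:preprocess({0, 255})
print(result[1], result[2])
```

With this shape the external module never needs to require anything from this project; it only returns a table of functions.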
from atari.
Just tested with a -modelBody option on different modes (sync/async, Catch/Atari) - you can see this in the model branch. I've also addressed Model:preprocess, with some info documented in the readme (note that this requires some commits I just pushed to rlenvs).
Have a look and see, but I think it could be enough to close this issue.
from atari.
Didn't realise you could net:add() another nn.Sequential() (the return of createBody) like that, much simpler than what I was thinking would be needed.
from atari.
Closing with the merge of model.
from atari.