Comments (6)
I would say that I haven't settled on a policy API yet... I've been a
little more focused on the environments. If you have time, could you write
out a little example code of how you see initializing policies? Looking
forward to what you come up with.
On Sunday, July 31, 2016, Marcus Appelros wrote:
Require a way to conveniently provide initial knowledge for a policy by
hand. For example, say we have a hexagonal grid on which we are tasked to
choose a sequence of cells, and it is certainly never correct to take the
first pick right at the grid edges; with a getter+setter we could both view
the current edge probabilities and set them to zero. Is such functionality
in line with the intended directions?
from reinforce.jl.
The getters are straightforward: just query the policy as usual. The
setters are a form of supervised learning, so it would make sense to save
every set value as a training example. Then we can have a basic
implementation, and if a user builds up a large library of samples they can
easily plug their favorite supervised-learning library into the setter
system.
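As a rough illustration of the idea, here is a minimal sketch in Julia. All names (`TablePolicy`, `getprobs`, `setprobs!`) are hypothetical and not part of Reinforce.jl; the point is only that every manual override is also recorded as a supervised training example.

```julia
# Hypothetical sketch -- none of these names come from Reinforce.jl itself.
# A policy stores action probabilities per state; the setter records every
# manual override as a (state, target) pair for later supervised fitting.

struct TablePolicy
    probs::Dict{Int,Vector{Float64}}              # state => action probabilities
    examples::Vector{Tuple{Int,Vector{Float64}}}  # recorded training examples
end
TablePolicy() = TablePolicy(Dict{Int,Vector{Float64}}(), Tuple{Int,Vector{Float64}}[])

# Getter: just query the policy as usual.
getprobs(p::TablePolicy, s::Int) = get(p.probs, s, Float64[])

# Setter: override the probabilities AND save the override as a training example.
function setprobs!(p::TablePolicy, s::Int, target::Vector{Float64})
    p.probs[s] = target
    push!(p.examples, (s, target))  # a supervised-learning library could fit on these
    return p
end
```

Zeroing the edge probabilities in the hexagonal-grid example would then be a loop of `setprobs!` calls over the edge states, and the accumulated `examples` vector is what a user's favorite supervised-learning library could later fit on.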
I think that, without sample code, I'll have a hard time understanding what
a "getter/setter" is. Do you mean a lookup table for states and actions? If
so, my interest lies much more in RL through function approximation, so I
don't have much need for table-lookup APIs (though they could certainly be
supported if others want that).
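For contrast, the same "setter" idea can also be phrased under function approximation rather than table lookup: a manual correction becomes a supervised example that nudges the approximator's parameters. This is only an assumed sketch (the `LinearPolicy`, `score`, and `teach!` names are made up, not Reinforce.jl API):

```julia
# Hypothetical sketch: the "setter" idea under function approximation.
# Instead of writing into a table, each manual (state, target) pair
# nudges the parameters of a parametric approximator.

using LinearAlgebra  # for dot

mutable struct LinearPolicy
    w::Vector{Float64}  # weights of a linear score function
end

# Getter: the policy's current score for a state feature vector.
score(p::LinearPolicy, x::Vector{Float64}) = dot(p.w, x)

# "Setter": one gradient step toward the supplied target (squared loss).
function teach!(p::LinearPolicy, x::Vector{Float64}, target::Float64; lr=0.1)
    err = score(p, x) - target
    p.w .-= lr .* err .* x
    return p
end
```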
Let's say our child is practicing math and we have prepared a challenging
problem. The getter would be asking what they think the answer is, and the
setter is telling them the answer.
So that's not really reinforcement learning. You should check out our
effort in JuliaML if you're interested in more general machine
learning. In RL there are no "answers", only rewards.
Yes, as mentioned this is supervised, and people would be able to plug in
their favorite ML library.
Connecting the two is the goal: schools don't let students work entirely on
their own, and neither do teachers lead them through every single problem. A
mix allows the AI to explore on its own with intermittent interventions
from more knowledgeable intelligences.
Reinforcement learning is key for robust AI, and just as mixing trace
elements into a metal can create strong alloys, adding specks of
supervision should significantly hasten progress.
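A toy sketch of what such "specks of supervision" could look like in a learning loop. Everything here is made up for illustration (no Reinforce.jl API is assumed): a random-walk agent mostly explores on its own, but every few steps a teacher supplies the action instead, and that correction is stored as a supervised example.

```julia
# Hypothetical sketch: mostly autonomous exploration, with intermittent
# teacher interventions that are recorded as supervised examples.

function run_episode(nsteps::Int; intervene_every=10)
    pos = 0
    examples = Tuple{Int,Int}[]        # (state, corrected action) pairs
    for t in 1:nsteps
        a = rand([-1, 1])              # agent explores on its own
        if t % intervene_every == 0
            a = pos > 0 ? -1 : 1       # teacher nudges back toward the origin
            push!(examples, (pos, a))  # speck of supervision, recorded
        end
        pos += a                       # environment advances; reward would flow as usual
    end
    return pos, examples
end
```

The recorded `examples` would feed the same setter/supervised-learning machinery discussed above, while the reward signal drives the RL side in between interventions.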