
Policy initialization (reinforce.jl, open, 6 comments)

juliaml (Marcus Appelros) commented on Jul 31, 2016

Require a way to conveniently manually provide initial knowledge for a policy.

For example, say we have a hexagonal grid on which we are tasked to choose a sequence of cells, and it is certainly never correct to make the first pick right at the grid edges. With a getter and a setter we could both view the current edge probabilities and set them to zero.

Is such functionality in line with the intended directions?
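One possible reading of the proposal, as a hypothetical Julia sketch. The type and function names are invented here for illustration and are not part of Reinforce.jl; a square grid stands in for the hexagonal one:

```julia
# Hypothetical sketch, not Reinforce.jl's actual API: a tabular policy
# with a getter/setter for first-pick probabilities per cell.
struct TabularPolicy
    probs::Dict{Tuple{Int,Int},Float64}   # cell => probability of picking it first
end

getprob(p::TabularPolicy, cell) = get(p.probs, cell, 0.0)     # getter: query as usual
setprob!(p::TabularPolicy, cell, v) = (p.probs[cell] = v; p)  # setter: inject knowledge

# Initialize uniformly over a toy 5x5 grid.
n = 5
policy = TabularPolicy(Dict((r, c) => 1 / n^2 for r in 1:n, c in 1:n))

# Encode the prior: the first pick is never right at the grid edges.
for r in 1:n, c in 1:n
    if r == 1 || r == n || c == 1 || c == n
        setprob!(policy, (r, c), 0.0)
    end
end

getprob(policy, (1, 1))   # 0.0 after the prior is applied
# (A real implementation would renormalize the remaining probabilities.)
```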

Comments (6)

tbreloff commented on Jul 31, 2016

I would say that I haven't settled on a policy API yet... I've been a little more focused on the environments. If you have time, could you write out a little example code of how you see initializing policies? Looking forward to what you come up with.


jhlq (Marcus Appelros) commented on Jul 31, 2016

The getters are straightforward: just query the policy as usual. The setters are a form of supervised learning, so it would make sense to save every set value as a training example. Then we can have a basic implementation, and a user who builds up a large library of samples can easily plug their favorite supervised-learning library into the setter system.
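A hypothetical sketch of that setter-as-training-data idea (all names invented, not Reinforce.jl API): the setter both updates the policy and logs the correction as a supervised example.

```julia
struct TaughtPolicy
    table::Dict{Any,Float64}               # current beliefs
    examples::Vector{Tuple{Any,Float64}}   # training set accumulated from setter calls
end
TaughtPolicy() = TaughtPolicy(Dict{Any,Float64}(), Tuple{Any,Float64}[])

value(p::TaughtPolicy, state) = get(p.table, state, 0.0)   # getter: query as usual

function teach!(p::TaughtPolicy, state, v)                 # setter: update and record
    p.table[state] = v
    push!(p.examples, (state, v))
    return p
end

p = TaughtPolicy()
teach!(p, :edge_cell, 0.0)
value(p, :edge_cell)   # 0.0
# Later, p.examples can be fed to any supervised-learning library to fit
# a function approximator to the accumulated human-provided labels.
```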

tbreloff commented on Jul 31, 2016

I think that, without sample code, I'll have a hard time understanding what a "getter/setter" is. Do you mean a lookup table for states and actions? If so, my interest lies much more in RL through function approximation, so I don't have much need for table-lookup APIs (though they could certainly be supported if others want them).
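For concreteness, a toy illustration of the two approaches being contrasted (invented code, not Reinforce.jl's API):

```julia
using LinearAlgebra   # for dot

# Table lookup: one stored value per (state, action) pair; each entry
# must be learned (or set) individually.
Q_table = Dict{Tuple{Int,Int},Float64}()
q_lookup(s, a) = get(Q_table, (s, a), 0.0)

# Function approximation: values come from a parameterized function,
# here a linear model over hand-built features, so similar states
# generalize instead of each being learned separately.
features(s, a) = [1.0, s, a, s * a]
θ = zeros(4)                          # learned parameters
q_approx(s, a) = dot(θ, features(s, a))
```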


jhlq (Marcus Appelros) commented on Jul 31, 2016

Let's say our child is practicing math and we have prepared a challenging problem. The getter would be asking what they think the answer is, and the setter would be telling them the answer.

tbreloff commented on Jul 31, 2016

So that's not really reinforcement learning. You should check out our effort in JuliaML if you're interested in more general machine learning. In RL there are no "answers", only rewards.
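A toy illustration of the distinction (invented code): a supervised update is given the correct answer, while an RL update only sees a scalar reward for the action actually taken.

```julia
# Supervised learning: the correct label y is provided for each input.
loss_supervised(ŷ, y) = (ŷ - y)^2

# Reinforcement learning: no label, only a reward r for the chosen action a;
# good actions must be discovered through trial and error.
update_rl!(Q, a, r; α = 0.1) = (Q[a] += α * (r - Q[a]); Q)
```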


jhlq (Marcus Appelros) commented on Jul 31, 2016

Yes, as mentioned, that part is supervised, and people would be able to plug in their favorite ML library.

Connecting the two is the goal: schools don't let students work entirely on their own, and neither do teachers lead them through every single problem. A mix allows the AI to explore on its own, with intermittent interventions from more knowledgeable intelligences.

Reinforcement learning is key for robust AI, and just as mixing trace elements into a metal can create strong alloys, adding specks of supervision will significantly hasten progress.
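A minimal self-contained sketch of that mix on a toy multi-armed bandit (everything here is invented for illustration, not Reinforce.jl code): the agent learns ε-greedily from rewards, while every 50th step a teacher forces the known-good arm and the intervention is logged as a supervised example.

```julia
n_arms = 5
true_means = randn(n_arms)
Q = zeros(n_arms)                  # value estimates learned from reward
counts = zeros(Int, n_arms)
examples = Int[]                   # supervised examples logged from teacher picks
teacher_arm = argmax(true_means)   # stand-in for outside knowledge

for step in 1:1000
    if step % 50 == 0              # intermittent supervised intervention
        a = teacher_arm
        push!(examples, a)
    elseif rand() < 0.1            # explore
        a = rand(1:n_arms)
    else                           # exploit current estimates
        a = argmax(Q)
    end
    r = true_means[a] + randn()    # stochastic reward from the environment
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a] # incremental-mean RL update
end
```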
