Comments (10)
Great question, I want to know too.
When is the best time to give a reward for a certain action?
from ml-agents.
@dora-gt It would be great if there were a lifecycle flowchart like the one Unity has.
You are right, AcademyStep happens before AgentStep. The reward is a public float on every agent. If you want to wait until all agents have done their steps, you can have a script that waits until all agents have performed their steps and rewards them accordingly. The rewards are collected at the beginning of the next frame, which means you can set things up so that rewards are only given after all agents have acted. The best time to give a reward is as soon as possible. Still, you must be careful not to overwrite a reward value you assigned before it has been collected. I would recommend using reward += value rather than reward = value (the reward is reset to 0 after each step).
On another note, I am rather curious about how you managed to perform training with the internal brain.
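The difference between accumulating and assigning a reward can be sketched in a few lines of Python. This is only an illustration of the collect-then-reset behaviour described above: the Agent class and its collect_reward method are hypothetical stand-ins, not the ML-Agents API.

```python
class Agent:
    """Hypothetical stand-in for an agent that exposes a public reward field."""

    def __init__(self):
        self.reward = 0.0

    def collect_reward(self):
        """Mimics the trainer: read the reward once per step, then reset it to 0."""
        collected = self.reward
        self.reward = 0.0
        return collected


agent = Agent()

# Two events fire during the same step, before the reward is collected.
agent.reward += 1.0    # reached a goal
agent.reward += -0.25  # small time penalty
accumulated = agent.collect_reward()

# With plain assignment, the second event silently replaces the first.
agent.reward = 1.0
agent.reward = -0.25
overwritten = agent.collect_reward()

print("accumulated:", accumulated)
print("overwritten:", overwritten)
```

With += both events contribute to the collected value; with = only the last assignment before collection survives.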
@vincentpierre Thanks~ I've managed to modify the script to run those in the order I want.
To train the network using TensorFlowSharp, I just need to give the train operation a name when building the TensorFlow graph. Something like:
optimizer = tf.train.MomentumOptimizer(learning_rate=lr, momentum=0.9, use_nesterov=False)
update_once = optimizer.minimize(self.loss, name='train_once')
And then you can run the operation by name in TensorFlowSharp.
It is also possible to run the restore/save checkpoint operation the same way.
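As a rough Python-side sketch of the idea above: build a graph with a named training operation, then drive it purely through string names, which is what a C# caller would have to do. It assumes TensorFlow 2.x with the tf.compat.v1 graph-mode API; the tiny linear model, the placeholder names x and y, and the init_all name are made up for illustration.

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Hypothetical one-weight linear model; only the op names matter here.
    x = tf1.placeholder(tf.float32, shape=[None, 1], name="x")
    y = tf1.placeholder(tf.float32, shape=[None, 1], name="y")
    w = tf1.get_variable("w", shape=[1, 1], initializer=tf1.zeros_initializer())
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y), name="loss")

    optimizer = tf1.train.MomentumOptimizer(
        learning_rate=0.1, momentum=0.9, use_nesterov=False)
    optimizer.minimize(loss, name="train_once")

    # Created after minimize() so the optimizer's slot variables are included.
    tf1.variables_initializer(tf1.global_variables(), name="init_all")

with tf1.Session(graph=graph) as sess:
    feed = {"x:0": [[1.0], [2.0]], "y:0": [[2.0], [4.0]]}
    sess.run("init_all")                      # operation fetched by name
    first_loss = sess.run("loss:0", feed_dict=feed)
    for _ in range(200):
        sess.run("train_once", feed_dict=feed)
    last_loss = sess.run("loss:0", feed_dict=feed)

print("loss before:", first_loss, "loss after:", last_loss)
```

Naming the initializer as well means the caller never needs a Python object handle, only the exported graph and the agreed-upon names.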
@tcmxx I am trying to reproduce what you have (training in the editor) and I have some questions. In C#, do you import a "frozen" graph or just the graph_def without the weights?
- If I just use the metagraph, my variables are not initialized and I have not figured out how to initialize the variables in C#. [Edit: It seems I can run the global variable initializer from C#, I just need to include it in my graph in Python.]
- If I use a frozen graph, I get the error
Input 0 of node opt/update_output_0/kernel/ApplyAdam was passed float from output_0/kernel:0 incompatible with expected float_ref.
which means that it cannot modify a constant tensor (I think it makes sense since my weights are frozen constants, they cannot change anymore).
Also, I do not know how to save and load checkpoints in C#; I have not uncovered everything TensorFlowSharp can do.
Do you have code somewhere I could look at? How do you generate the TensorFlow graph, and how do you call the training operator from C#? This is something I think a lot of people would like to have as a feature. Could you make a PR?
@vincentpierre
Yes, you can create the init operation in Python and then call it in C#. Saving and loading work the same way. Use something like:
saver = tf.train.Saver()
to define a saver. The graph will then contain operations/tensors for saving, restoring, and the checkpoint path.
You can get their names as follows:
# Export the saver's definition, which records the relevant tensor/operation names.
saver_def = saver.as_saver_def()
# The name of the tensor you must feed with a filename when saving/restoring.
print(saver_def.filename_tensor_name)
# The name of the target operation you must run when restoring.
print(saver_def.restore_op_name)
# The name of the target operation you must run when saving.
print(saver_def.save_tensor_name)
I don't have the code online yet; it is too messy. But I've forked your repo and am moving my old stuff over to use your Unity ML. I can also put my old code online if you want to take a look.
Btw, what would you like me to put in a PR?
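The save/restore-by-name approach from the comment above can be tried out in Python before porting it to C#. The following is a minimal sketch, assuming TensorFlow 2.x with the tf.compat.v1 graph-mode API; the variable w and the names w_value, set_w_to_5 and init_all are hypothetical and exist only to make the round trip observable.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    w = tf1.get_variable("w", shape=[], initializer=tf1.constant_initializer(1.0))
    tf.identity(w, name="w_value")           # named read of the variable
    tf1.assign(w, 5.0, name="set_w_to_5")    # named op that changes the variable
    tf1.variables_initializer(tf1.global_variables(), name="init_all")

    saver = tf1.train.Saver()
    saver_def = saver.as_saver_def()

# These three strings are all a non-Python caller needs to know.
filename_tensor = saver_def.filename_tensor_name
save_tensor = saver_def.save_tensor_name
restore_op = saver_def.restore_op_name

checkpoint_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

with tf1.Session(graph=graph) as sess:
    sess.run("init_all")
    # Save: fetch the save tensor while feeding the checkpoint path.
    sess.run(save_tensor, feed_dict={filename_tensor: checkpoint_path})
    sess.run("set_w_to_5")
    changed = sess.run("w_value:0")
    # Restore: run the restore op while feeding the same path.
    sess.run(restore_op, feed_dict={filename_tensor: checkpoint_path})
    restored = sess.run("w_value:0")

print("after change:", changed, "after restore:", restored)
```

Running the save tensor directly writes the checkpoint data but, unlike Saver.save, does not update the "checkpoint" bookkeeping file, so the caller has to keep track of the path itself.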
@tcmxx Thank you for your help. I was hoping there was a way to save and load the model in C#. I do not know what a checkpoint would look like if I called the saving operator from C# at runtime. I would be happy to look at your code or pieces of your code. I thought you were working on a CoreBrainInternal in ML-Agents and that you were able to do training with the internal brain (so I wanted to know if you could make a PR of it or of an Example Environment), but I understand this work is in progress.
@vincentpierre The checkpoint would be just the same as if you called it from Python. I am working on a cleaner version of a demo scene for my professor's class, and I can show it to you and make a PR before next week.
Folks, closing out this issue as it has been inactive for some time. We have since released ML-Agents v0.3, so try it out and create a new issue if you run into similar problems.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.