Hello, I am a little confused about the order of you calling those functions inter

Order of Academy Step and Agent Step (ml-agents, CLOSED)

unity-technologies commented on August 16, 2024

from ml-agents.

Comments (10)

dora-gt commented on August 16, 2024

Great question, I want to know too.

When is the best time to give a reward for a certain action?

tcmxx commented on August 16, 2024

@dora-gt It would be great if there were a lifecycle flowchart, like the one Unity has.

vincentpierre commented on August 16, 2024

You are right, AcademyStep happens before AgentStep. The reward is a public float for all agents. If you want to wait until all agents have done their steps, you can have a script that waits until all agents have performed their steps and rewards them accordingly. The rewards are collected at the beginning of the next frame; this means that you can imagine a scenario in which the rewards are only given after all agents have acted. The best time to give a reward is as soon as possible. Still, you must be careful not to overwrite a reward value you assigned before it has been collected. I would recommend using reward += value rather than reward = value (the reward is reset to 0 after each step).
On another note, I am rather curious about how you managed to perform training with the internal brain.
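
To illustrate the accumulate-then-collect advice above, here is a minimal sketch in plain Python. It is a toy model, not the ML-Agents API: ToyAgent, collect_reward, and run_frame are hypothetical names, and the only behavior it assumes is what the comment describes (a public reward float that is read and reset to 0 once per step).

```python
class ToyAgent:
    """Hypothetical stand-in for an agent; not the ML-Agents API."""

    def __init__(self):
        self.reward = 0.0  # public float, as described in the comment

    def collect_reward(self):
        # Models the reward being read at the start of the next frame
        # and then reset to 0.
        collected = self.reward
        self.reward = 0.0
        return collected


def run_frame(agent, reward_events, accumulate):
    """Apply several reward events within one step, then collect once."""
    for value in reward_events:
        if accumulate:
            agent.reward += value  # keeps earlier rewards from this step
        else:
            agent.reward = value   # overwrites anything not yet collected
    return agent.collect_reward()


agent = ToyAgent()
events = [1.0, -0.25, 0.5]
accumulated = run_frame(agent, events, accumulate=True)
overwritten = run_frame(agent, events, accumulate=False)
print(accumulated, overwritten)
```

With these three events, the accumulating variant collects 1.25, while the overwriting variant collects only the last value, 0.5, because the two earlier rewards were replaced before collection.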

tcmxx commented on August 16, 2024

@vincentpierre Thanks~ I've managed to modify the script to run those in the order I want.
To train the network using TensorflowSharp, I just need to give the train operation a name when building the TensorFlow graph. Something like:

optimizer = tf.train.MomentumOptimizer(learning_rate=lr, momentum=0.9, use_nesterov=False)
update_once = optimizer.minimize(self.loss, name='train_once')

And then you can run the operation by name in TensorflowSharp.
It is also possible to run the restore/save checkpoint operation the same way.

vincentpierre commented on August 16, 2024

@tcmxx I am trying to reproduce what you have (training in the editor) and I have some questions. In C#, do you import a "frozen" graph or just the graph_def without the weights?

If I just use the metagraph, my variables are not initialized and I have not figured out how to initialize the variables in C#. [Edit: It seems I can run the global variable initializer from C#, I just need to include it in my graph in Python.]
If I use a frozen graph, I get the error "Input 0 of node opt/update_output_0/kernel/ApplyAdam was passed float from output_0/kernel:0 incompatible with expected float_ref.", which means that it cannot modify a constant tensor (I think it makes sense: since my weights are frozen constants, they cannot change anymore).

Also, I do not know how to save and load checkpoints in C#; I have not uncovered everything TensorflowSharp can do.

Do you have code somewhere I could look at? How do you generate the TensorFlow graph, and how do you call the training operator from C#? This is something I think a lot of people would like to have as a feature. Could you make a PR?

tcmxx commented on August 16, 2024

@vincentpierre
Yes, you can create the init op in Python and then call it in C#. Saving and loading work the same way. Use something like:
saver = tf.train.Saver()
to define a saver. There should then be operations/tensors for saving, restoring, and the checkpoint path.
You can get their names as follows:

# One way to obtain saver_def (an assumption; it may also come from a MetaGraphDef):
saver_def = saver.as_saver_def()
# The name of the tensor you must feed with a filename when saving/restoring.
print(saver_def.filename_tensor_name)
# The name of the target operation you must run when restoring.
print(saver_def.restore_op_name)
# The name of the target operation you must run when saving.
print(saver_def.save_tensor_name)

I don't have the code online yet; it is too messy. But I've forked your repo and am moving my old stuff over to use your Unity ML. I can also put my old code online if you want to take a look.

Btw, what would you like me to put in a PR?

vincentpierre commented on August 16, 2024

@tcmxx Thank you for your help, I was hoping there was a way to save and load the model in C#. I do not know what a checkpoint would look like if I called the saving operator from C# at runtime. I would be happy to look at your code or pieces of your code. I thought you were working on a CoreBrainInternal in ML-Agents and that you were able to do training with the internal brain (so I wanted to know if you could make a PR of it or of an Example Environment), but I understand this work is in progress.

tcmxx commented on August 16, 2024

@vincentpierre The checkpoint would be just the same as if you called it from Python. I am working on a cleaner version of a demo scene for my professor's class, and I can show it to you and make a PR before next week.

mmattar commented on August 16, 2024

Folks, closing out this issue as it has been inactive for some time. We have since released ML-Agents v0.3, so try it out and create a new issue if you run into similar problems.

lock commented on August 16, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
