
Comments (10)

dora-gt commented on August 16, 2024

Great question, I want to know too.

When is the best time to give a reward for a certain action?

from ml-agents.

tcmxx commented on August 16, 2024

@dora-gt It would be great if there were a lifecycle flowchart like the one Unity has.


vincentpierre commented on August 16, 2024

You are right, AcademyStep happens before AgentStep. The reward is a public float for all agents. If you want to wait until all agents have done their steps, you can have a script that waits until all agents have performed their steps and rewards them accordingly. The rewards are collected at the beginning of the next frame, which means you can set up a scenario in which the rewards are only given after all agents have acted. The best time to give a reward is as soon as possible. Still, you must be careful not to overwrite a reward value you assigned before it has been collected. I would recommend using reward += value rather than reward = value (the reward is reset to 0 after each step).
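To illustrate why accumulating is safer than assigning, here is a minimal sketch in plain Python. The Agent class, add_reward, and collect_reward are hypothetical stand-ins invented for this example; they are not the ML-Agents API. The sketch only assumes what is described above: the reward is read once at the start of the next frame and then reset to 0.

```python
class Agent:
    """Toy stand-in for an agent; names here are illustrative, not ML-Agents API."""

    def __init__(self):
        self.reward = 0.0

    def add_reward(self, value):
        # Accumulate, so several reward sources within one step all count.
        self.reward += value

    def collect_reward(self):
        # Mimics the framework reading the reward at the start of the
        # next frame and then resetting it to 0.
        collected, self.reward = self.reward, 0.0
        return collected


agent = Agent()
agent.add_reward(0.5)    # e.g. bonus from one script
agent.add_reward(-0.25)  # e.g. penalty from another script in the same step
step_reward = agent.collect_reward()
```

With plain assignment (agent.reward = 0.5 followed by agent.reward = -0.25), the first value would be lost before collection; with accumulation both contribute to the collected reward.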
On another note, I am rather curious about how you managed to perform training with the internal brain.


tcmxx commented on August 16, 2024

@vincentpierre Thanks~ I've managed to modify the script to run those in the order I want.
To train the network using TensorFlowSharp, I just need to give the train operation a name when building the TensorFlow graph. Something like:

optimizer = tf.train.MomentumOptimizer(learning_rate=lr, momentum=0.9, use_nesterov=False)
update_once = optimizer.minimize(self.loss, name='train_once')

You can then run the operation by name in TensorFlowSharp.
It is also possible to run the checkpoint save/restore operations the same way.


vincentpierre commented on August 16, 2024

@tcmxx I am trying to reproduce what you have (training in the editor) and I have some questions. In C#, do you import a "frozen" graph or just the graph_def without the weights?

  • If I just use the metagraph, my variables are not initialized and I have not figured out how to initialize the variables in C#. [Edit: It seems I can run the global variable initializer from C#, I just need to include it in my graph in Python.]
  • If I use a frozen graph, I get the error "Input 0 of node opt/update_output_0/kernel/ApplyAdam was passed float from output_0/kernel:0 incompatible with expected float_ref.", which means that it cannot modify a constant tensor (I think it makes sense: since my weights are frozen constants, they cannot change anymore).

Also, I do not know how to save and load checkpoints in C#; I have not uncovered everything TensorFlowSharp can do.

Do you have code somewhere I could look at? How do you generate the TensorFlow graph, and how do you call the training operator from C#? This is something I think a lot of people would like to have as a feature. Could you make a PR?


tcmxx commented on August 16, 2024

@vincentpierre
Yes, you can create the init op in Python and then call it in C#. Save/load works the same way. Use something like:
saver = tf.train.Saver()
to define a saver. The graph will then contain operations/tensors for saving, restoring, and the checkpoint path.
You can get their names as follows:

# Get the SaverDef that describes the saver defined above.
saver_def = saver.as_saver_def()
# The name of the tensor you must feed with a filename when saving/restoring.
print(saver_def.filename_tensor_name)
# The name of the target operation you must run when restoring.
print(saver_def.restore_op_name)
# The name of the target operation you must run when saving.
print(saver_def.save_tensor_name)

I don't have the code online yet; it is too messy. But I've forked your repo and am moving my old stuff over to use your Unity ML. I can also put my old code online if you want to take a look.

Btw, what would you like me to put in a PR?


vincentpierre commented on August 16, 2024

@tcmxx Thank you for your help. I was hoping there was a way to save and load the model in C#. I do not know what a checkpoint would look like if I called the saving operator from C# at runtime. I would be happy to look at your code or pieces of your code. I thought you were working on a CoreBrainInternal in ML-Agents and that you were able to do training with the internal brain (so I wanted to know if you could make a PR of it or of an Example Environment), but I understand this work is in progress.


tcmxx commented on August 16, 2024

@vincentpierre The checkpoint would be just the same as if you called it from Python. I am working on a cleaner version of a demo scene for my professor's class, and I can show it to you and make a PR before next week.


mmattar commented on August 16, 2024

Folks, closing out this issue as it has been inactive for some time. We have since released ML-Agents v0.3, so try it out and create a new issue if you run into similar problems.


lock commented on August 16, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

