unity-technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

Home Page: https://unity.com/products/machine-learning-agents

License: Other

Python 40.39% C# 54.53% Dockerfile 0.04% Batchfile 0.05% ShaderLab 0.24% Shell 0.06% C 0.01% Jupyter Notebook 4.68%
reinforcement-learning unity3d deep-learning unity deep-reinforcement-learning neural-networks machine-learning

ml-agents's Introduction

Unity ML-Agents Toolkit


The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. We provide implementations (based on PyTorch) of state-of-the-art algorithms to enable game developers and hobbyists to easily train intelligent agents for 2D, 3D and VR/AR games. Researchers can also use the provided simple-to-use Python API to train Agents using reinforcement learning, imitation learning, neuroevolution, or any other methods. These trained agents can be used for multiple purposes, including controlling NPC behavior (in a variety of settings such as multi-agent and adversarial), automated testing of game builds and evaluating different game design decisions pre-release. The ML-Agents Toolkit is mutually beneficial for both game developers and AI researchers as it provides a central platform where advances in AI can be evaluated on Unity’s rich environments and then made accessible to the wider research and game developer communities.

Features

  • 17+ example Unity environments
  • Support for multiple environment configurations and training scenarios
  • Flexible Unity SDK that can be integrated into your game or custom Unity scene
  • Support for training single-agent, multi-agent cooperative, and multi-agent competitive scenarios via several Deep Reinforcement Learning algorithms (PPO, SAC, MA-POCA, self-play).
  • Support for learning from demonstrations through two Imitation Learning algorithms (BC and GAIL).
  • Quickly and easily add your own custom training algorithm and/or components.
  • Easily definable Curriculum Learning scenarios for complex tasks
  • Train robust agents using environment randomization
  • Flexible agent control with On Demand Decision Making
  • Train using multiple concurrent Unity environment instances
  • Utilizes Unity Sentis to provide native cross-platform support
  • Unity environment control from Python (see the sketch below)
  • Wrap Unity learning environments as a gym environment
  • Wrap Unity learning environments as a PettingZoo environment

See our ML-Agents Overview page for detailed descriptions of all these features. Or go straight to our web docs.
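
As a quick taste of the "Unity environment control from Python" feature above, the sketch below drives a built Unity environment with the low-level mlagents_envs API. It is illustrative only: it assumes a current mlagents_envs release and a build named "3DBall", and it sends random actions rather than training anything.

from mlagents_envs.environment import UnityEnvironment

# Launch the built environment (pass file_name=None to connect to the Editor instead).
env = UnityEnvironment(file_name="3DBall")
env.reset()

# Each Agent behavior in the scene shows up under a behavior name with a spec.
behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]

for _ in range(100):
    # Agents requesting a decision this step, and agents whose episode just ended.
    decision_steps, terminal_steps = env.get_steps(behavior_name)
    # Send random actions just to exercise the loop.
    env.set_actions(behavior_name, spec.action_spec.random_action(len(decision_steps)))
    env.step()

env.close()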

Releases & Documentation

Our latest, stable release is Release 21. Click here to get started with the latest release of ML-Agents.

You can also check out our new web docs!

The table below lists all our releases, including our main branch which is under active development and may be unstable. A few helpful guidelines:

  • The Versioning page overviews how we manage our GitHub releases and the versioning process for each of the ML-Agents components.
  • The Releases page contains details of the changes between releases.
  • The Migration page contains details on how to upgrade from earlier releases of the ML-Agents Toolkit.
  • The Documentation links in the table below include installation and usage instructions specific to each release. Remember to always use the documentation that corresponds to the release version you're using.
  • The com.unity.ml-agents package is verified for Unity 2020.1 and later. Verified packages releases are numbered 1.0.x.
Version              Release Date      Source  Documentation  Download  Python Package  Unity Package
develop (unstable)   --                source  docs           download  --              --
Release 21           October 9, 2023   source  docs           download  1.0.0           3.0.0

If you are a researcher interested in a discussion of Unity as an AI platform, see a pre-print of our reference paper on Unity and the ML-Agents Toolkit.

If you use Unity or the ML-Agents Toolkit to conduct research, we ask that you cite the following paper as a reference:

@article{juliani2020,
  title={Unity: A general platform for intelligent agents},
  author={Juliani, Arthur and Berges, Vincent-Pierre and Teng, Ervin and Cohen, Andrew and Harper, Jonathan and Elion, Chris and Goy, Chris and Gao, Yuan and Henry, Hunter and Mattar, Marwan and Lange, Danny},
  journal={arXiv preprint arXiv:1809.02627},
  url={https://arxiv.org/pdf/1809.02627.pdf},
  year={2020}
}

Additionally, if you use the MA-POCA trainer in your research, we ask that you cite the following paper as a reference:

@article{cohen2022,
  title={On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning},
  author={Cohen, Andrew and Teng, Ervin and Berges, Vincent-Pierre and Dong, Ruo-Ping and Henry, Hunter and Mattar, Marwan and Zook, Alexander and Ganguly, Sujoy},
  journal={RL in Games Workshop AAAI 2022},
  url={http://aaai-rlg.mlanctot.info/papers/AAAI22-RLG_paper_32.pdf},
  year={2022}
}

Additional Resources

We have a Unity Learn course, ML-Agents: Hummingbirds, that provides a gentle introduction to Unity and the ML-Agents Toolkit.

We've also partnered with CodeMonkeyUnity to create a series of tutorial videos on how to implement and use the ML-Agents Toolkit.

We have also published a series of blog posts that are relevant to ML-Agents.

More from Unity

Community and Feedback

The ML-Agents Toolkit is an open-source project and we encourage and welcome contributions. If you wish to contribute, be sure to review our contribution guidelines and code of conduct.

For problems with the installation and setup of the ML-Agents Toolkit, or discussions about how to best set up or train your agents, please create a new thread on the Unity ML-Agents forum and make sure to include as much detail as possible. If you run into any other problems using the ML-Agents Toolkit or have a specific feature request, please submit a GitHub issue.

Please tell us which samples you would like to see shipped with the ML-Agents Unity package by replying to this forum thread.

Your opinion matters a great deal to us. Only by hearing your thoughts on the Unity ML-Agents Toolkit can we continue to improve and grow. Please take a few minutes to let us know about it.

For any other questions or feedback, connect directly with the ML-Agents team at [email protected].

Privacy

To improve the developer experience for the Unity ML-Agents Toolkit, we have added in-editor analytics. Please refer to "Information that is passively collected by Unity" in the Unity Privacy Policy.

ml-agents's People

Contributors

acelisweaven, alex-mccarthy-unity, alphonsocrawford, andersonaddo, andrewcoh, anupambhatnagar, awjuliani, brccabral, chriselion, christiancoenen, dongruoping, eltronix, ervteng, eshvk, harper-u3d, hunter-unity, hvpeteet, jo3w4rd, jrupert-unity, mantasp, maryamhonari, miguelalonsojr, rsfutch77, runswimflyrich, sankalp04, shihzy, surfnerd, vincentpierre, xcao65, xiaomaogy


ml-agents's Issues

Change TimeScale from Python

Hey all, debugging my first custom environment. It'd be great if there were a fast/easy way to change the Academy's TimeScale from a setting in python, so that we could have things be really slow for debugging the ML interface, and then crank it up for actual training.
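
For reference, later releases of the toolkit expose the engine's time scale to Python through a side channel; a minimal sketch assuming a current mlagents_envs install, with the environment name being illustrative:

from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel

channel = EngineConfigurationChannel()
env = UnityEnvironment(file_name="MyEnvironment", side_channels=[channel])

channel.set_configuration_parameters(time_scale=1.0)    # slow, for debugging the ML interface
# ... once things look right, crank it up for actual training:
channel.set_configuration_parameters(time_scale=20.0)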

UnityEnvironmentException: The Unity environment took too long to respond

Hi,

I am trying to run the ML-Agents 3D Ball example project, but I am receiving an error in Jupyter during the "Load the Environment" cell. The Unity application opens up and the balls fall down on the plates and roll off. The application runs for about 30 seconds or so until it closes, and the following error can be seen in Jupyter:

"UnityEnvironmentException: The Unity environment took too long to respond. Make sure environment does not need user interaction to launch and that the Academy and the external Brain(s) are attached to objects in the Scene."

The setup I am using is:

  • Unity 2017.1.1f1
  • Mac OSX Sierra 10.12.6
  • Tensorflow 1.3
  • Anaconda 4.3.27
  • Python 3.6

The following trace was printed:

---------------------------------------------------------------------------
timeout Traceback (most recent call last)
~/Unity Projects/ml-agents/python/unityagents/environment.py in __init__(self, file_name, worker_id, base_port)
84 self._socket.listen(1)
---> 85 self._conn, _ = self._socket.accept()
86 self._conn.setblocking(1)

~/anaconda3/lib/python3.6/socket.py in accept(self)
204 """
--> 205 fd, addr = self._accept()
206 # If our type has the SOCK_NONBLOCK flag, we shouldn't pass it onto the

timeout: timed out

During handling of the above exception, another exception occurred:

UnityEnvironmentException Traceback (most recent call last)
in ()
----> 1 env = UnityEnvironment(file_name=env_name)
2 print(str(env))
3 brain_name = env.brain_names[0]

~/Unity Projects/ml-agents/python/unityagents/environment.py in __init__(self, file_name, worker_id, base_port)
91 "The Unity environment took too long to respond. Make sure {} does not need user interaction to launch "
92 "and that the Academy and the external Brain(s) are attached to objects in the Scene.".format(
---> 93 str(file_name)))
94 except UnityEnvironmentException:
95 proc1.kill()

UnityEnvironmentException: The Unity environment took too long to respond. Make sure environment does not need user interaction to launch and that the Academy and the external Brain(s) are attached to objects in the Scene.

Unable to set Graph Model in Balance Ball

I am following the instructions from https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Getting-Started-with-Balance-Ball.md. All is well until I want to incorporate the model into the Unity environment to see it play the game.

When I follow these instructions :

Change the Type of Brain to Internal.
Drag the <env_name>.bytes file from the Project window of the Editor to the Graph Model placeholder in the 3DBallBrain inspector window.

I get stuck at setting the Graph Model. I simply do not see such a property in the Inspector. This is on MacOS Sierra running Unity 2017.1.1f1

(screenshot attached)

What am I missing here?

Need help with multiple brains in one scene

I can't seem to get it working. I posted several questions regarding it and received several answers, but even though I got past more and more "exceptions" I can't get it to run.
I don't think it's in my power to ask for and receive enough feedback to make it work, so I hereby ask some code guru to make a working example that I can modify enough to suit me.

Questions:
Can there be an example scene with 2 working brains? <-- This is the most important step I am working towards and can't get working. Please, if you can code it, post what needs to be made; I am desperate.

Can there be an option to call the train function for a specific brain, and to step different brains asynchronously?
I got a reply that it's not possible to train more than one brain in one environment at the moment, but I would love to see it happen. Can anyone give direction for this?

[Unity-iOS] NotSupportedException

I tried the iOS with TensorFlowSharp example and got the following error when deploying to a mobile device. It works fine in the Unity Editor.

Here is the Error Message:


NotSupportedException: To marshal a managed method, please add an attribute named 'MonoPInvokeCallback' to the method definition.
  at TensorFlow.TFTensor..ctor (System.Int32 value) [0x00000] in <00000000000000000000000000000000>:0 
  at TensorFlow.TFTensor.op_Implicit (System.Int32 value) [0x00000] in <00000000000000000000000000000000>:0 
  at ExampleCommon.ImageUtil.ConstructGraphToNormalizeImage (TensorFlow.TFGraph& graph, TensorFlow.TFOutput& input, TensorFlow.TFOutput& output, TensorFlow.TFDataType destinationDataType) [0x00000] in <00000000000000000000000000000000>:0 
  at ExampleCommon.ImageUtil.CreateTensorFromImageFile (System.String file, TensorFlow.TFDataType destinationDataType) [0x00000] in <00000000000000000000000000000000>:0 
  at TensorTest.Test () [0x00000] in <00000000000000000000000000000000>:0 
 
(Filename: currently not available on il2cpp Line: -1)

Setting up 1 worker threads for Enlighten.
  Thread -> id: 16f54b000 -> priority: 1 

Here is my source code (grabbed from the TensorFlowSharp example):

using System.IO;
using UnityEngine;
using TensorFlow;

namespace ExampleCommon
{
	public static class ImageUtil
	{
		// Convert the image in filename to a Tensor suitable as input to the Inception model.
		public static TFTensor CreateTensorFromImageFile (string file, TFDataType destinationDataType = TFDataType.Float)
		{
			byte[] contents;
			if (Application.platform == RuntimePlatform.Android)
			{
				WWW reader = new WWW(file);
				while (!reader.isDone) { }

				contents = reader.bytes;
			} else {
				contents = File.ReadAllBytes(file);
			}

			// var contents = File.ReadAllBytes (file);

			// DecodeJpeg uses a scalar String-valued tensor as input.
			var tensor = TFTensor.CreateString (contents);

			TFGraph graph;
			TFOutput input, output;

			// Construct a graph to normalize the image
			ConstructGraphToNormalizeImage (out graph, out input, out output, destinationDataType);

			// Execute that graph to normalize this one image
			using (var session = new TFSession (graph)) {
				var normalized = session.Run (
						 inputs: new [] { input },
						 inputValues: new [] { tensor },
						 outputs: new [] { output });

				return normalized [0];
			}
		}

		// The inception model takes as input the image described by a Tensor in a very
		// specific normalized format (a particular image size, shape of the input tensor,
		// normalized pixel values etc.).
		//
		// This function constructs a graph of TensorFlow operations which takes as
		// input a JPEG-encoded string and returns a tensor suitable as input to the
		// inception model.
		private static void ConstructGraphToNormalizeImage (out TFGraph graph, out TFOutput input, out TFOutput output, TFDataType destinationDataType = TFDataType.Float)
		{
			// Some constants specific to the pre-trained model at:
			// https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
			//
			// - The model was trained after with images scaled to 224x224 pixels.
			// - The colors, represented as R, G, B in 1-byte each were converted to
			//   float using (value - Mean)/Scale.

			const int W = 224;
			const int H = 224;
			const float Mean = 127.5f;
			const float Scale = 127.5f;

			graph = new TFGraph ();
			input = graph.Placeholder (TFDataType.String);

			output = graph.Cast (graph.Div (
				x: graph.Sub (
					x: graph.ResizeBilinear (
						images: graph.ExpandDims (
							input: graph.Cast (
								graph.DecodeJpeg (contents: input, channels: 3), DstT: TFDataType.Float),
							dim: graph.Const (0, "make_batch")),
						size: graph.Const (new int [] { W, H }, "size")),
					y: graph.Const (Mean, "mean")),
				y: graph.Const (Scale, "scale")), destinationDataType);
		}
	}
}

How do you design around a variable number of states?

How do you design your agent in environments with a variable number of states?

To give an example: say you have an agent that has to get from location A to location B. In the environment there is a range of "enemies". The agent's job is to get from A to B while being rewarded for staying as far as possible from the enemies. There will also be obstacles on the path, which the agent has to maneuver around. The tricky part is: while going from A to B, more enemies and obstacles will spawn.

My initial thought was to append the spawned enemies and obstacles to the "state" list, but if each index in the list does not represent the same type of state, how does the brain differentiate between an obstacle and an enemy?

How would you go about this? I am relatively new to machine learning, so excuse my lack of lingo.
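
One common workaround, sketched below, is a fixed-size slot encoding: reserve a maximum number of entity slots in the observation vector, tag each slot with a one-hot type flag so the network can tell enemies from obstacles, and zero-pad unused slots. This is only an illustration of the padding idea, not a toolkit API; the slot count and features are made up.

MAX_ENTITIES = 8
FEATURES_PER_SLOT = 4  # [is_enemy, is_obstacle, relative_x, relative_z]

def build_observation(agent_pos, enemies, obstacles):
    slots = []
    for kind, positions in (("enemy", enemies), ("obstacle", obstacles)):
        for pos in positions:
            slots.append([
                1.0 if kind == "enemy" else 0.0,
                1.0 if kind == "obstacle" else 0.0,
                pos[0] - agent_pos[0],
                pos[1] - agent_pos[1],
            ])
    slots = slots[:MAX_ENTITIES]                    # truncate if too many entities
    while len(slots) < MAX_ENTITIES:                # zero-pad if too few
        slots.append([0.0] * FEATURES_PER_SLOT)
    return [value for slot in slots for value in slot]   # flatten to a fixed length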

PPO: JSONDecodeError


JSONDecodeError Traceback (most recent call last)
in ()
37 info = env.reset(train_mode=train_model)[brain_name]
38 # Decide and take an action
---> 39 new_info = trainer.take_action(info, env, brain_name)
40 info = new_info
41 trainer.process_experiences(info, time_horizon, gamma, lambd)

D:\Unity Projects\ml-agents-master\python\ppo\trainer.py in take_action(self, info, env, brain_name)
51 self.stats['value_estimate'].append(value)
52 self.stats['entropy'].append(ent)
---> 53 new_info = env.step(actions, value={brain_name: value})[brain_name]
54 self.add_experiences(info, new_info, epsi, actions, a_dist, value)
55 return new_info

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in step(self, action, memory, value)
336 self._conn.send(b"STEP")
337 self._send_action(action, memory, value)
--> 338 return self._get_state()
339 elif not self._loaded:
340 raise UnityEnvironmentException("No Unity environment is loaded.")

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in _get_state(self)
204 self._data = {}
205 for index in range(self._num_brains):
--> 206 state_dict = self._get_state_dict()
207 b = state_dict["brain_name"]
208 n_agent = len(state_dict["agents"])

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in _get_state_dict(self)
171 state = self._conn.recv(self._buffer_size).decode('utf-8')
172 self._conn.send(b"RECEIVED")
--> 173 state_dict = json.loads(state)
174 return state_dict
175

d:\program files\python36\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder

d:\program files\python36\lib\json\decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):

d:\program files\python36\lib\json\decoder.py in raw_decode(self, s, idx)
353 """
354 try:
--> 355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
357 raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting ',' delimiter: line 98 column 13 (char 1460)

Passing Agent to Decision

Currently, there is no way to know for which agent Decision.Decide or Decision.MakeMemory is called. Knowing it may help in some cases, for example when the agent's state is a large vector intended for training a neural network and cannot easily be used to make a decision directly. The agent itself may contain some sort of high-level API, which would greatly simplify the decider's task.

Custom ~Agent.cs error - Cannot reshape array

I must be overlooking something, anyone know what this is pointing to? :D

(screenshot attached)

Maybe this works better: (second screenshot attached)

It occurs no matter how many states I have within the code, and no matter what values I change in the state parameter within the Inspector.

Also, when states are set to 0 in the Inspector, there are no errors except in game, where it expects 8, which brings us back to the issue :(

(screenshot attached)

How can an agent change its brain?

In order to have a flexible cooperative architecture (e.g., Mergeable nervous systems for robots, Nature 2017), the ability to change the brain at any point seems essential. I was wondering whether it is possible for agents to change their brain at runtime. Also, is it possible to form a structure that can be changed at runtime for cooperative tasks?

Some rewards are missing when using frame skip.

When Academy.frameToSkip > 0, only rewards from non-skipped frames are sent to Python. Instead, rewards from skipped frames should be summed with the last reward. The current behaviour may make some games unbeatable.
For example, imagine Montezuma's Revenge replicated in Unity. Rewards in this game are very sparse: something like 999 out of 1000 frames have zero reward. With the current frame-skip implementation, the non-zero reward will be skipped most of the time, and the AI won't be able to beat the game at all.
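
For illustration, the suggested behaviour in a generic, gym-style stepping loop might look like the sketch below. This is only a sketch of the idea, not the toolkit's internal frame-skip code: rewards earned on skipped frames are accumulated rather than dropped.

def step_with_frame_skip(env, action, frames_to_skip):
    # Repeat the action over skipped frames and sum the rewards, so sparse
    # rewards that land on a skipped frame are not lost.
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(frames_to_skip + 1):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info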

How do I create a periodic reward

I am building an agent that periodically hits the ball.

If the ball is not hit for longer than 2 seconds, I want to add a negative reward, so I've added a float that tracks time:

private float m_time_since_last_hit = 0f;

void Update()
{

   m_time_since_last_hit += Time.deltaTime;

   if ( m_time_since_last_hit >2 )
   {
     m_agent.reward = -0.1f;
     m_time_since_last_hit = 0f;    
   }

}
void RegisterHit()
{
    m_agent.reward = 1f;
    m_time_since_last_hit = 0f;
}

But during a simulation, where the time scale is up to 100, 2 seconds can equal over 3 minutes of simulated time.

  1. How do I account for the time scale? Can I use m_time_since_last_hit += Time.deltaTime * Time.timeScale; or is scaling of time handled differently?

  2. Should I pass m_time_since_last_hit to CollectState()?

documentation - Observing Training Progress

For some reason the following command does not work on Windows 10:
tensorboard --logdir='./summaries'

You get "No scalar data was found." on localhost:6006

The following works:
tensorboard --logdir=./summaries
or
tensorboard --logdir summaries

CoreBrainInternal throws error when trying to use an internal brain with observations.

This line (197) fails: runner.AddInput(graph[graphScope + placeholder.name][0], new float[] { Random.Range(placeholder.minValue, placeholder.maxValue) }).

I grabbed the latest script and it now throws an exception, but I've confirmed that epsilon is set on the brain. Debug.Log shows that graph is failing to return anything; the other vars have values (except graphScope; could that be the problem?).

I also noticed when trying to debug that the GridWorld example does not have an internal brain option. Related?

Brain screenshot attached.

Python client freezes if reset() called instantly after creating UnityEnvironment

from unityagents import UnityEnvironment
env = UnityEnvironment(file_name='3DBall')
env.reset()

This code freezes the Python client 4 out of 5 times. If env.reset() is called with some delay (~2 seconds), everything works as expected. This may happen because the Python client starts communicating before the game is fully loaded.
Tested on Python 3.6 / Jupyter / Windows 10 x64.
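
Based on the behaviour described above, a simple workaround sketch is to give the build a couple of seconds to finish loading before the first reset(); the file name and delay are illustrative.

import time
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name='3DBall')
time.sleep(2)   # let the Unity build finish starting up before talking to it
env.reset()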

AttributeError

I am getting an AttributeError while running the 3rd cell of the PPO jupyter notebook. Not sure what is causing this. Please help.

(screenshot attached)

SIGALRM not present in Windows

How do you run the model in Ubuntu and Unity in Windows? The SIGALRM signal is not present on Windows. After I disabled the 30-second signal for raising the exception, it still doesn't seem to be working. In Jupyter, it seems to never return control; however, the build does seem to load.
Below is the line:
env = UnityEnvironment(file_name=env_name)

IndexOutOfRangeException: Index was outside the bounds of the array.

While running the 3DBall example, I am getting an IndexOutOfRangeException for the Platform agent. Here's the snippet for the error:
(screenshot attached)

I thought changing the value of the Action Size parameter in the Inspector window to 8 might solve the issue, but that just gives me another error:
(screenshot attached)

Order of Academy Step and Agent Step

Hello,
I am a little confused about the order in which you call those functions internally. It looks like AcademyStep is called before AgentStep, and the rewards are collected after the agent step and before the next AcademyStep; am I right? In this case we have to know the reward of each agent right after its step, which might sometimes not be possible. Correct me if I am wrong.

Also, do you have plans to support training with the internal brain? I've done it somehow, but just want to know if you will provide a better API.

Thank you!

AttributeError: module 'signal' has no attribute 'SIGALRM'

I get the following error when running the Jupyter Notebook step.
This happens in the PPO and Basic versions.
I have Windows 10 and the latest Unity 2017.2.0b6 Personal.

The game opens up and the balls drop, but the tables do not move.
The brain is set to external.

I built the Unity environment and put the .exe here: ....\ml-agents-master\python

ppo.py
setup.py
3DBall.exe
3DBall_data

Here is what I updated in the Jupyter Notebook:

General parameters

max_steps = 10000 # Set maximum number of steps to run environment.
run_path = "ppo" # The sub-directory name for model and summary statistics
load_model = False # Whether to load a saved model.
train_model = True # Whether to train the model.
summary_freq = 10000 # Frequency at which to save training statistics.
save_freq = 50000 # Frequency at which to save model.
env_name = "3DBall" # Name of the training environment file.


ERROR MESSAGE

Load the environment
In [3]:

env = UnityEnvironment(file_name=env_name)
print(str(env))
brain_name = env.brain_names[0]

AttributeError Traceback (most recent call last)
in ()
----> 1 env = UnityEnvironment(file_name=env_name)
2 print(str(env))
3 brain_name = env.brain_names[0]

C:\Users\mpliszka\Documents\UnityProjects\ml-agents-master\python\unityagents\environment.py in __init__(self, file_name, worker_id, base_port)
86 str(file_name)))
87
---> 88 old_handler = signal.signal(signal.SIGALRM, timeout_handler)
89 signal.alarm(30) # trigger alarm in x seconds
90 try:

AttributeError: module 'signal' has no attribute 'SIGALRM'

Problem with new environment

I created a new environment as per the instructions given, created the build file, and tried to run the PPO file. But for some reason it is stuck on the line env.reset(train_mode=train_model). Specifically, it is stuck at this statement inside environment.py (line 172):
state = self._conn.recv(self._buffer_size).decode('utf-8')

The PPO file was changed to give some random actions at each step. I checked the same for 3D Ball and it is working. I am unable to debug this issue; please provide some help.

Should amend README not to add large files to git

I installed ML-AgentsWithPlugin.unitypackage and added it to my git repository, but the package contained a large file that exceeded 100 MB, and then I was no longer able to push.

We should amend the README to warn against adding such large files to git and to recommend using git LFS instead.

Unable to get tensorboard reporting to find event file on windows 10 - help request

I have been successful in executing the PPO notebook, including importing a bytes file into Unity and having it run both in the Editor and in a new standalone build.

I am successful in having the TensorBoard menu display at localhost:6060, but I am unable to get TensorBoard to find my event files and display results, on Windows 10 with Python 3.

// this command from the help file causes an error
tensorboard -- logdir DIRECTORY_PATH --debug
// I can see event files in ./summaries and I have tried using the fully qualified path; both ways localhost:6060 tells me that the files cannot be found.
tensorboard --logdir='./summaries'

documentation + use clarification

I'm trying to implement my own agent and environment and I'm a little unclear about a few things:

The implementations of CollectState() for Ball3DAgent and TennisAgent both seem to be roughly normalizing the floats they are returning. Is this a requirement, or just a best practice?

What's the range of values that can be expected from the float array act in AgentStep?

The MemorySize argument in the Brain: do you define it in the same way you do the State and Actions? If so, where? Is that only for heuristic brains?

Issue with setting env_name (syntax error)

Upon setting everything up, setting the env_name seems to have an issue in my build:

File "", line 11
We can reset the environment to be provided with an initial set of observations and states for all the agents within the environment. In ML-Agents, states refer to a vector of variables corresponding to relevant aspects of the environment for an agent. Likewise, observations refer to a set of relevant pixel-wise visuals for an agent.
^
SyntaxError: invalid syntax

The env_name is set as follows:

env_name = "3DBall"
train_mode = True

Also, there are no import errors and the exported .app is within the Python folder.

Thanks!

Need some help with 3DBall example

Setting up dependencies and building 3DBall example for the first time using readme ...

While running the PPO notebook "load the environment" step, there is a timeout. I have tried passing a different worker_id and port, but that doesn't help. Running all this on Windows 10.

Could someone figure out what went wrong? I pretty much followed the instructions in the readme and didn't get fancy with anything.

(screenshot attached)

PPO ValueError: need at least one array to concatenate

I'm trying to train my own agent in an environment and am getting this error; not sure what I'm misconfiguring in my scene for this to happen:

---------------------------------------------------------------------------
ValueError
Traceback (most recent call last)
in ()
42 if len(trainer.training_buffer['actions']) > buffer_size and train_model:
43 # Perform gradient descent with experience buffer
---> 44 trainer.update_model(batch_size, num_epoch)
45 if steps % summary_freq == 0 and steps != 0 and train_model:
46 # Write training statistics to tensorboard.

/Users/sterlingcrispin/code/Unity-ML/python/ppo/trainer.pyc in update_model(self, batch_size, num_epoch)
139 if self.is_continuous:
140 feed_dict[self.model.epsilon] = np.vstack(training_buffer['epsilons'][start:end])
--> 141 feed_dict[self.model.state_in] = np.vstack(training_buffer['states'][start:end])
142 else:
143 feed_dict[self.model.action_holder] = np.hstack(training_buffer['actions'][start:end])

/Users/sterlingcrispin/anaconda/lib/python2.7/site-packages/numpy/core/shape_base.pyc in vstack(tup)
235
236 """
--> 237 return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
238
239 def hstack(tup):

ValueError: need at least one array to concatenate

Unity environment freezing on launch

I've got to the point that I can successfully launch the 3D ball example. However, when I run the jupyter notebook file and the app starts automatically, it starts a tiny window and then freezes, and has to be process killed (see image below). Jupyter notebook doesn't show any errors, so I'm not entirely sure what's up. Any ideas as to how to fix this?

(screenshot attached)

Thank you!

Issues with Basic example

After training the Basic agent with PPO.ipynb and sending the bytes file back to Unity, the console shows this error:

UnityAgentsException: Expects arg[0] to be int32 but float is provided CoreBrainInternal.DecideAction () (at Assets/ML-Agents/Scripts/CoreBrainInternal.cs:240) Brain.DecideAction () (at Assets/ML-Agents/Scripts/Brain.cs:312) Academy.DecideAction () (at Assets/ML-Agents/Scripts/Academy.cs:250) Academy.RunMdp () (at Assets/ML-Agents/Scripts/Academy.cs:337) Academy.FixedUpdate () (at Assets/ML-Agents/Scripts/Academy.cs:260)

I have checked that the Action space type is Discrete; any ideas?


Trying to use BasicDecision.cs returns this error in the console:
NullReferenceException: Object reference not set to an instance of an object BasicAgent.AgentStep (System.Single[] act) (at Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs:25) Agent.Step () (at Assets/ML-Agents/Scripts/Agent.cs:209) Brain.Step () (at Assets/ML-Agents/Scripts/Brain.cs:324) Academy.RunMdp () (at Assets/ML-Agents/Scripts/Academy.cs:346) Academy.FixedUpdate () (at Assets/ML-Agents/Scripts/Academy.cs:260)

Easy fix: replace line 9 with this:
return new float[1]{ 1f };

Regards

Windows : No connection could be made because the target machine actively refused it.

On Windows 10: When trying to run the 3DBall environment in the Jupyter Notebook I get this exception (shown in the game application when using a debug-enabled .exe). The Jupyter notebook then times out because of this:


SocketException: No connection could be made because the target machine actively refused it.

  at System.Net.Sockets.Socket.Connect (System.Net.IPAddress[] addresses, System.Int32 port) [0x0011a] in <f044fe2c9e7e4b8e91984b41f0cf0b04>:0 
  at System.Net.Sockets.Socket.Connect (System.String host, System.Int32 port) [0x00007] in <f044fe2c9e7e4b8e91984b41f0cf0b04>:0 
  at ExternalCommunicator.InitializeCommunicator () [0x0004a] in C:\Users\XXXX\Git\ml-agents\unity-environment\Assets\ML-Agents\Scripts\ExternalCommunicator.cs:98 

I turned off the Windows firewall, so I'm unsure what could still cause this. Do I need to run something as Administrator?

TFException: Shape [-1,24] has negative dimensions

I was running a training session in the PPO notebook and stopped it, then exported the graph:

INFO:tensorflow:Restoring parameters from ./models/ppo/model-400000.cptk
Converted 4 variables to const ops.
20 ops in the final graph.

but when I use this .bytes file in Unity I get:

TFException: Shape [-1,24] has negative dimensions [[Node: epsilon = Placeholder[dtype=DT_FLOAT, shape=[?,24], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] TensorFlow.TFStatus.CheckMaybeRaise (TensorFlow.TFStatus incomingStatus, System.Boolean last) (at <6ed6db22f8874deba74ffe3e566039be>:0) TensorFlow.TFSession.Run (TensorFlow.TFOutput[] inputs, TensorFlow.TFTensor[] inputValues, TensorFlow.TFOutput[] outputs, TensorFlow.TFOperation[] targetOpers, TensorFlow.TFBuffer runMetadata, TensorFlow.TFBuffer runOptions, TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0) TensorFlow.TFSession+Runner.Run (TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0) CoreBrainInternal.DecideAction () (at Assets/ML-Agents/Scripts/CoreBrainInternal.cs:244) Brain.DecideAction () (at Assets/ML-Agents/Scripts/Brain.cs:308) Academy.DecideAction () (at Assets/ML-Agents/Scripts/Academy.cs:250) Academy.RunMdp () (a

Not sure why this has a -1 value. I've stopped training prematurely on the Ball3D demo without any problems; does anyone have any ideas?

Add Learning Rate Annealing to PPO

The current implementation of PPO uses a fixed learning rate for the duration of the training process. This can produce degenerate models later in training, when a smaller learning rate is necessary.

The learning rate should be annealed over time to 0.
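
For illustration, linear annealing to zero using the TensorFlow 1.x API this version of the trainer is built on could look like the sketch below; the initial rate and step count are placeholders, not the trainer's actual hyperparameters.

import tensorflow as tf

global_step = tf.Variable(0, trainable=False, name="global_step")

# Decays linearly (power=1.0) from 3e-4 down to 0 over decay_steps updates.
learning_rate = tf.train.polynomial_decay(
    learning_rate=3e-4,
    global_step=global_step,
    decay_steps=5000000,
    end_learning_rate=0.0,
    power=1.0)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)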

Python client freezes if exception is occured in game

Hi,
If an exception is raised in the C# code (at Agent.Step in my case), the Python client freezes in the env.step() function. It would be nice to catch the exception, send it to the Python client, and raise it there, or at least check for a timeout in the Python client and raise an exception.
Currently the timeout is disabled because of the self._conn.setblocking(1) line. If it's replaced by self._conn.settimeout(10), the timeout occurs as it should.
Tested on Python 3.6 / Jupyter / Windows 10 x64.
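
A self-contained sketch of the suggested change (the helper name is illustrative, not the actual environment.py code): receive with a timeout so a hung or crashed Unity process surfaces as a Python exception instead of blocking forever.

import socket

def recv_with_timeout(conn, buffer_size, timeout=10.0):
    conn.settimeout(timeout)  # instead of conn.setblocking(1)
    try:
        return conn.recv(buffer_size).decode("utf-8")
    except socket.timeout:
        raise RuntimeError(
            "No response from the Unity environment within %s seconds; "
            "it may have crashed or hit an exception." % timeout)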

AttributeError: module 'signal' has no attribute 'SIGALRM'

On Windows, signal() can only be called with SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, or SIGTERM. A ValueError will be raised in any other case.

C:\Repos\UnityRF\ml-agents\python\unityagents\environment.py in __init__(self, file_name, worker_id, base_port)
     86                     str(file_name)))
     87 
---> 88         old_handler = signal.signal(signal.SIGALRM, timeout_handler)
     89         signal.alarm(30)  # trigger alarm in x seconds
     90         try:

AttributeError: module 'signal' has no attribute 'SIGALRM'
