unity-technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.

Home Page: https://unity.com/products/machine-learning-agents

License: Other

Python 40.39% C# 54.53% Dockerfile 0.04% Batchfile 0.05% ShaderLab 0.24% Shell 0.06% C 0.01% Jupyter Notebook 4.68%
reinforcement-learning unity3d deep-learning unity deep-reinforcement-learning neural-networks machine-learning

ml-agents's Introduction

Unity ML-Agents Toolkit


The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. We provide implementations (based on PyTorch) of state-of-the-art algorithms to enable game developers and hobbyists to easily train intelligent agents for 2D, 3D and VR/AR games. Researchers can also use the provided simple-to-use Python API to train Agents using reinforcement learning, imitation learning, neuroevolution, or any other methods. These trained agents can be used for multiple purposes, including controlling NPC behavior (in a variety of settings such as multi-agent and adversarial), automated testing of game builds and evaluating different game design decisions pre-release. The ML-Agents Toolkit is mutually beneficial for both game developers and AI researchers as it provides a central platform where advances in AI can be evaluated on Unity’s rich environments and then made accessible to the wider research and game developer communities.

Features

  • 17+ example Unity environments
  • Support for multiple environment configurations and training scenarios
  • Flexible Unity SDK that can be integrated into your game or custom Unity scene
  • Support for training single-agent, multi-agent cooperative, and multi-agent competitive scenarios via several Deep Reinforcement Learning algorithms (PPO, SAC, MA-POCA, self-play).
  • Support for learning from demonstrations through two Imitation Learning algorithms (BC and GAIL).
  • Quickly and easily add your own custom training algorithm and/or components.
  • Easily definable Curriculum Learning scenarios for complex tasks
  • Train robust agents using environment randomization
  • Flexible agent control with On Demand Decision Making
  • Train using multiple concurrent Unity environment instances
  • Utilizes Unity Sentis to provide native cross-platform support
  • Unity environment control from Python (see the sketch below)
  • Wrap Unity learning environments as a gym environment
  • Wrap Unity learning environments as a PettingZoo environment

See our ML-Agents Overview page for detailed descriptions of all these features. Or go straight to our web docs.
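
As a quick taste of the "Unity environment control from Python" feature above, the sketch below drives a built Unity environment with the low-level mlagents_envs API. It is illustrative only: it assumes a current mlagents_envs release and a build named "3DBall", and it sends random actions rather than training anything.

from mlagents_envs.environment import UnityEnvironment

# Launch the built environment (pass file_name=None to connect to the Editor instead).
env = UnityEnvironment(file_name="3DBall")
env.reset()

# Each Agent behavior in the scene shows up under a behavior name with a spec.
behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]

for _ in range(100):
    # Agents requesting a decision this step, and agents whose episode just ended.
    decision_steps, terminal_steps = env.get_steps(behavior_name)
    # Send random actions just to exercise the loop.
    env.set_actions(behavior_name, spec.action_spec.random_action(len(decision_steps)))
    env.step()

env.close()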

Releases & Documentation

Our latest, stable release is Release 21. Click here to get started with the latest release of ML-Agents.

You can also check out our new web docs!

The table below lists all our releases, including our main branch which is under active development and may be unstable. A few helpful guidelines:

  • The Versioning page overviews how we manage our GitHub releases and the versioning process for each of the ML-Agents components.
  • The Releases page contains details of the changes between releases.
  • The Migration page contains details on how to upgrade from earlier releases of the ML-Agents Toolkit.
  • The Documentation links in the table below include installation and usage instructions specific to each release. Remember to always use the documentation that corresponds to the release version you're using.
  • The com.unity.ml-agents package is verified for Unity 2020.1 and later. Verified packages releases are numbered 1.0.x.
Version              Release Date      Source  Documentation  Download  Python Package  Unity Package
develop (unstable)   --                source  docs           download  --              --
Release 21           October 9, 2023   source  docs           download  1.0.0           3.0.0

If you are a researcher interested in a discussion of Unity as an AI platform, see a pre-print of our reference paper on Unity and the ML-Agents Toolkit.

If you use Unity or the ML-Agents Toolkit to conduct research, we ask that you cite the following paper as a reference:

@article{juliani2020,
  title={Unity: A general platform for intelligent agents},
  author={Juliani, Arthur and Berges, Vincent-Pierre and Teng, Ervin and Cohen, Andrew and Harper, Jonathan and Elion, Chris and Goy, Chris and Gao, Yuan and Henry, Hunter and Mattar, Marwan and Lange, Danny},
  journal={arXiv preprint arXiv:1809.02627},
  url={https://arxiv.org/pdf/1809.02627.pdf},
  year={2020}
}

Additionally, if you use the MA-POCA trainer in your research, we ask that you cite the following paper as a reference:

@article{cohen2022,
  title={On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning},
  author={Cohen, Andrew and Teng, Ervin and Berges, Vincent-Pierre and Dong, Ruo-Ping and Henry, Hunter and Mattar, Marwan and Zook, Alexander and Ganguly, Sujoy},
  journal={RL in Games Workshop AAAI 2022},
  url={http://aaai-rlg.mlanctot.info/papers/AAAI22-RLG_paper_32.pdf},
  year={2022}
}

Additional Resources

We have a Unity Learn course, ML-Agents: Hummingbirds, that provides a gentle introduction to Unity and the ML-Agents Toolkit.

We've also partnered with CodeMonkeyUnity to create a series of tutorial videos on how to implement and use the ML-Agents Toolkit.

We have also published a series of blog posts that are relevant to ML-Agents.

More from Unity

Community and Feedback

The ML-Agents Toolkit is an open-source project and we encourage and welcome contributions. If you wish to contribute, be sure to review our contribution guidelines and code of conduct.

For problems with the installation and setup of the ML-Agents Toolkit, or discussions about how to best set up or train your agents, please create a new thread on the Unity ML-Agents forum and make sure to include as much detail as possible. If you run into any other problems using the ML-Agents Toolkit or have a specific feature request, please submit a GitHub issue.

Please tell us which samples you would like to see shipped with the ML-Agents Unity package by replying to this forum thread.

Your opinion matters a great deal to us. Only by hearing your thoughts on the Unity ML-Agents Toolkit can we continue to improve and grow. Please take a few minutes to let us know about it.

For any other questions or feedback, connect directly with the ML-Agents team at [email protected].

Privacy

To improve the developer experience for the Unity ML-Agents Toolkit, we have added in-editor analytics. Please refer to "Information that is passively collected by Unity" in the Unity Privacy Policy.

ml-agents's People

Contributors

acelisweaven, alex-mccarthy-unity, alphonsocrawford, andersonaddo, andrewcoh, anupambhatnagar, awjuliani, brccabral, chriselion, christiancoenen, dongruoping, eltronix, ervteng, eshvk, harper-u3d, hunter-unity, hvpeteet, jo3w4rd, jrupert-unity, mantasp, maryamhonari, miguelalonsojr, rsfutch77, runswimflyrich, sankalp04, shihzy, surfnerd, vincentpierre, xcao65, xiaomaogy


ml-agents's Issues

Change TimeScale from Python

Hey all, debugging my first custom environment. It'd be great if there were a fast/easy way to change the Academy's TimeScale from a setting in python, so that we could have things be really slow for debugging the ML interface, and then crank it up for actual training.
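
For reference, later releases of the toolkit expose the engine's time scale to Python through a side channel; a minimal sketch assuming a current mlagents_envs install, with the environment name being illustrative:

from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel

channel = EngineConfigurationChannel()
env = UnityEnvironment(file_name="MyEnvironment", side_channels=[channel])

channel.set_configuration_parameters(time_scale=1.0)    # slow, for debugging the ML interface
# ... once things look right, crank it up for actual training:
channel.set_configuration_parameters(time_scale=20.0)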

UnityEnvironmentException: The Unity environment took too long to respond

Hi,

I am trying to run the ML-Agents 3D Ball example project, but I am receiving an error in Jupyter during the "Load the Environment" cell. The Unity application opens up and the balls fall down on the plates and roll off. The application runs for about 30 seconds or so until it closes, and the following error can be seen in Jupyter:

"UnityEnvironmentException: The Unity environment took too long to respond. Make sure environment does not need user interaction to launch and that the Academy and the external Brain(s) are attached to objects in the Scene."

The setup I am using is:

  • Unity 2017.1.1f1
  • Mac OSX Sierra 10.12.6
  • Tensorflow 1.3
  • Anaconda 4.3.27
  • Python 3.6

The following trace was printed:

---------------------------------------------------------------------------
timeout Traceback (most recent call last)
~/Unity Projects/ml-agents/python/unityagents/environment.py in __init__(self, file_name, worker_id, base_port)
84 self._socket.listen(1)
---> 85 self._conn, _ = self._socket.accept()
86 self._conn.setblocking(1)

~/anaconda3/lib/python3.6/socket.py in accept(self)
204 """
--> 205 fd, addr = self._accept()
206 # If our type has the SOCK_NONBLOCK flag, we shouldn't pass it onto the

timeout: timed out

During handling of the above exception, another exception occurred:

UnityEnvironmentException Traceback (most recent call last)
in ()
----> 1 env = UnityEnvironment(file_name=env_name)
2 print(str(env))
3 brain_name = env.brain_names[0]

~/Unity Projects/ml-agents/python/unityagents/environment.py in __init__(self, file_name, worker_id, base_port)
91 "The Unity environment took too long to respond. Make sure {} does not need user interaction to launch "
92 "and that the Academy and the external Brain(s) are attached to objects in the Scene.".format(
---> 93 str(file_name)))
94 except UnityEnvironmentException:
95 proc1.kill()

UnityEnvironmentException: The Unity environment took too long to respond. Make sure environment does not need user interaction to launch and that the Academy and the external Brain(s) are attached to objects in the Scene.

Unable to set Graph Model in Balance Ball

I am following the instructions from https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Getting-Started-with-Balance-Ball.md. All is well until I want to incorporate the model into the Unity environment to see it play the game.

When I follow these instructions :

Change the Type of Brain to Internal.
Drag the <env_name>.bytes file from the Project window of the Editor to the Graph Model placeholder in the 3DBallBrain inspector window.

I get stuck at setting the Graph Model. I simply do not see such a property in the Inspector. This is on MacOS Sierra running Unity 2017.1.1f1

(screenshot attached)

What am I missing here?

Need help with multiple brains in one scene

I can't seem to get it working. I posted several questions regarding it and received several answers, but even though I got past more and more "exceptions" I can't get it to run.
I don't think it's in my power to ask for and receive enough feedback to make it work, so I hereby ask some code guru to make a working example that I can modify enough to suit me.

Questions:
Can there be an example scene with 2 working brains? <-- This is the most important step I am working towards and can't get working. Please, if you can code it, post what needs to be made; I am desperate.

Can there be an option to call the train function for a specific brain, and to step different brains asynchronously?
I got a reply that it's not possible to train more than one brain in one environment at the moment, but I would love to see it happen. Can anyone give direction for this?

[Unity-iOS] NotSupportedException

I tried the iOS with TensorFlowSharp example and got the following error when deploying to a mobile device. It works fine in the Unity Editor.

Here is the Error Message:


NotSupportedException: To marshal a managed method, please add an attribute named 'MonoPInvokeCallback' to the method definition.
  at TensorFlow.TFTensor..ctor (System.Int32 value) [0x00000] in <00000000000000000000000000000000>:0 
  at TensorFlow.TFTensor.op_Implicit (System.Int32 value) [0x00000] in <00000000000000000000000000000000>:0 
  at ExampleCommon.ImageUtil.ConstructGraphToNormalizeImage (TensorFlow.TFGraph& graph, TensorFlow.TFOutput& input, TensorFlow.TFOutput& output, TensorFlow.TFDataType destinationDataType) [0x00000] in <00000000000000000000000000000000>:0 
  at ExampleCommon.ImageUtil.CreateTensorFromImageFile (System.String file, TensorFlow.TFDataType destinationDataType) [0x00000] in <00000000000000000000000000000000>:0 
  at TensorTest.Test () [0x00000] in <00000000000000000000000000000000>:0 
 
(Filename: currently not available on il2cpp Line: -1)

Setting up 1 worker threads for Enlighten.
  Thread -> id: 16f54b000 -> priority: 1 

Here is my source code (grabbed from the TensorFlowSharp example):

using System.IO;
using UnityEngine;
using TensorFlow;

namespace ExampleCommon
{
	public static class ImageUtil
	{
		// Convert the image in filename to a Tensor suitable as input to the Inception model.
		public static TFTensor CreateTensorFromImageFile (string file, TFDataType destinationDataType = TFDataType.Float)
		{
			byte[] contents;
			if (Application.platform == RuntimePlatform.Android)
			{
				WWW reader = new WWW(file);
				while (!reader.isDone) { }

				contents = reader.bytes;
			} else {
				contents = File.ReadAllBytes(file);
			}

			// var contents = File.ReadAllBytes (file);

			// DecodeJpeg uses a scalar String-valued tensor as input.
			var tensor = TFTensor.CreateString (contents);

			TFGraph graph;
			TFOutput input, output;

			// Construct a graph to normalize the image
			ConstructGraphToNormalizeImage (out graph, out input, out output, destinationDataType);

			// Execute that graph to normalize this one image
			using (var session = new TFSession (graph)) {
				var normalized = session.Run (
						 inputs: new [] { input },
						 inputValues: new [] { tensor },
						 outputs: new [] { output });

				return normalized [0];
			}
		}

		// The inception model takes as input the image described by a Tensor in a very
		// specific normalized format (a particular image size, shape of the input tensor,
		// normalized pixel values etc.).
		//
		// This function constructs a graph of TensorFlow operations which takes as
		// input a JPEG-encoded string and returns a tensor suitable as input to the
		// inception model.
		private static void ConstructGraphToNormalizeImage (out TFGraph graph, out TFOutput input, out TFOutput output, TFDataType destinationDataType = TFDataType.Float)
		{
			// Some constants specific to the pre-trained model at:
			// https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
			//
			// - The model was trained after with images scaled to 224x224 pixels.
			// - The colors, represented as R, G, B in 1-byte each were converted to
			//   float using (value - Mean)/Scale.

			const int W = 224;
			const int H = 224;
			const float Mean = 127.5f;
			const float Scale = 127.5f;

			graph = new TFGraph ();
			input = graph.Placeholder (TFDataType.String);

			output = graph.Cast (graph.Div (
				x: graph.Sub (
					x: graph.ResizeBilinear (
						images: graph.ExpandDims (
							input: graph.Cast (
								graph.DecodeJpeg (contents: input, channels: 3), DstT: TFDataType.Float),
							dim: graph.Const (0, "make_batch")),
						size: graph.Const (new int [] { W, H }, "size")),
					y: graph.Const (Mean, "mean")),
				y: graph.Const (Scale, "scale")), destinationDataType);
		}
	}
}

How do you design around a variable number of states?

How do you design your agent in environments with a variable number of states?

To give an example: say you have an agent that has to get from location A to location B. In the environment there is a range of "enemies". The agent's job is to get from A to B while being rewarded for staying as far as possible from the enemies. There will also be obstacles on the path, which the agent has to maneuver around. The tricky part is: while going from A to B, more enemies and obstacles will spawn.

My initial thought was to append the spawned enemies and obstacles to the "state" list, but if each index in the list does not represent the same type of state, how does the brain differentiate between an obstacle and an enemy?

How would you go about this? I am relatively new to machine learning, so excuse my lack of lingo.
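
One common workaround, sketched below, is a fixed-size slot encoding: reserve a maximum number of entity slots in the observation vector, tag each slot with a one-hot type flag so the network can tell enemies from obstacles, and zero-pad unused slots. This is only an illustration of the padding idea, not a toolkit API; the slot count and features are made up.

MAX_ENTITIES = 8
FEATURES_PER_SLOT = 4  # [is_enemy, is_obstacle, relative_x, relative_z]

def build_observation(agent_pos, enemies, obstacles):
    slots = []
    for kind, positions in (("enemy", enemies), ("obstacle", obstacles)):
        for pos in positions:
            slots.append([
                1.0 if kind == "enemy" else 0.0,
                1.0 if kind == "obstacle" else 0.0,
                pos[0] - agent_pos[0],
                pos[1] - agent_pos[1],
            ])
    slots = slots[:MAX_ENTITIES]                    # truncate if too many entities
    while len(slots) < MAX_ENTITIES:                # zero-pad if too few
        slots.append([0.0] * FEATURES_PER_SLOT)
    return [value for slot in slots for value in slot]   # flatten to a fixed length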

PPO: JSONDecodeError


JSONDecodeError Traceback (most recent call last)
in ()
37 info = env.reset(train_mode=train_model)[brain_name]
38 # Decide and take an action
---> 39 new_info = trainer.take_action(info, env, brain_name)
40 info = new_info
41 trainer.process_experiences(info, time_horizon, gamma, lambd)

D:\Unity Projects\ml-agents-master\python\ppo\trainer.py in take_action(self, info, env, brain_name)
51 self.stats['value_estimate'].append(value)
52 self.stats['entropy'].append(ent)
---> 53 new_info = env.step(actions, value={brain_name: value})[brain_name]
54 self.add_experiences(info, new_info, epsi, actions, a_dist, value)
55 return new_info

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in step(self, action, memory, value)
336 self._conn.send(b"STEP")
337 self._send_action(action, memory, value)
--> 338 return self._get_state()
339 elif not self._loaded:
340 raise UnityEnvironmentException("No Unity environment is loaded.")

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in _get_state(self)
204 self._data = {}
205 for index in range(self._num_brains):
--> 206 state_dict = self._get_state_dict()
207 b = state_dict["brain_name"]
208 n_agent = len(state_dict["agents"])

D:\Unity Projects\ml-agents-master\python\unityagents\environment.py in _get_state_dict(self)
171 state = self._conn.recv(self._buffer_size).decode('utf-8')
172 self._conn.send(b"RECEIVED")
--> 173 state_dict = json.loads(state)
174 return state_dict
175

d:\program files\python36\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder

d:\program files\python36\lib\json\decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):

d:\program files\python36\lib\json\decoder.py in raw_decode(self, s, idx)
353 """
354 try:
--> 355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
357 raise JSONDecodeError("Expecting value", s, err.value) from None

JSONDecodeError: Expecting ',' delimiter: line 98 column 13 (char 1460)

Passing Agent to Decision

Currently, there is no way to know for which agent Decision.Decide or Decision.MakeMemory is called. Knowing it may help in some cases, for example when the agent's state is a large vector intended for training a neural network and cannot easily be used to make a decision directly. The agent itself may contain some sort of high-level API, which would greatly simplify the decider's task.

Custom ~Agent.cs error - Cannot reshape array

I must be overlooking something, anyone know what this is pointing to? :D

(screenshot attached)

Maybe this works better: (second screenshot attached)

It occurs no matter how many states I have within the code, and no matter what values I change in the state parameter within the Inspector.

Also, when states are set to 0 in the Inspector, there are no errors except in game, where it expects 8, which brings us back to the issue :(

(screenshot attached)

How can an agent change its brain?

In order to have a flexible cooperative architecture (e.g., Mergeable nervous systems for robots, Nature 2017), the ability to change the brain at any point seems essential. I was wondering whether it is possible for agents to change their brain at runtime. Also, is it possible to form a structure that can be changed at runtime for cooperative tasks?

Some rewards are missing when using frame skip.

When Academy.frameToSkip > 0, only rewards from non-skipped frames are sent to Python. Instead, rewards from skipped frames should be summed with the last reward. The current behaviour may make some games unbeatable.
For example, imagine Montezuma's Revenge replicated in Unity. Rewards in this game are very sparse: something like 999 out of 1000 frames have zero reward. With the current frame-skip implementation, the non-zero reward will be skipped most of the time, and the AI won't be able to beat the game at all.
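
For illustration, the suggested behaviour in a generic, gym-style stepping loop might look like the sketch below. This is only a sketch of the idea, not the toolkit's internal frame-skip code: rewards earned on skipped frames are accumulated rather than dropped.

def step_with_frame_skip(env, action, frames_to_skip):
    # Repeat the action over skipped frames and sum the rewards, so sparse
    # rewards that land on a skipped frame are not lost.
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(frames_to_skip + 1):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info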

How do I create a periodic reward

I am building an agent that periodically hits the ball.

If the ball is not hit for longer than 2 seconds, I want to add a negative reward, so I've added a float that tracks time:

private float m_time_since_last_hit = 0f;

void Update()
{

   m_time_since_last_hit += Time.deltaTime;

   if ( m_time_since_last_hit >2 )
   {
     m_agent.reward = -0.1f;
     m_time_since_last_hit = 0f;    
   }

}
void RegisterHit()
{
    m_agent.reward = 1f;
    m_time_since_last_hit = 0f;
}

But during a simulation, where the time scale is up to 100, 2 seconds can equal over 3 minutes of simulated time.

  1. How do I account for the time scale? Can I use m_time_since_last_hit += Time.deltaTime * Time.timeScale; or is scaling of time handled differently?

  2. Should I pass m_time_since_last_hit to CollectState()?

documentation - Observing Training Progress

For some reason the following command does not work on Windows 10:
tensorboard --logdir='./summaries'

You get "No scalar data was found." on localhost:6006

The following works:
tensorboard --logdir=./summaries
or
tensorboard --logdir summaries

CoreBrainInternal throws error when trying to use an internal brain with observations.

This line (197) fails: runner.AddInput(graph[graphScope + placeholder.name][0], new float[] { Random.Range(placeholder.minValue, placeholder.maxValue) }).

I grabbed the latest script and it now throws an exception, but I've confirmed that epsilon is set on the brain. Debug.Log shows that graph is failing to return anything; the other vars have values (except graphScope; could that be the problem?).

I also noticed when trying to debug that the GridWorld example does not have an internal brain option. Related?

Brain screenshot attached.

Python client freezes if reset() called instantly after creating UnityEnvironment

from unityagents import UnityEnvironment
env = UnityEnvironment(file_name='3DBall')
env.reset()

This code freezes the Python client 4 out of 5 times. If env.reset() is called with some delay (~2 seconds), everything works as expected. This may happen because the Python client starts communicating before the game is fully loaded.
Tested on Python 3.6 / Jupyter / Windows 10 x64.
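
Based on the behaviour described above, a simple workaround sketch is to give the build a couple of seconds to finish loading before the first reset(); the file name and delay are illustrative.

import time
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name='3DBall')
time.sleep(2)   # let the Unity build finish starting up before talking to it
env.reset()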

AttributeError

I am getting an AttributeError while running the 3rd cell of the PPO jupyter notebook. Not sure what is causing this. Please help.

(screenshot attached)

SIGALRM not present in Windows

How do you run the model in Ubuntu and Unity in Windows? The SIGALRM signal is not present on Windows. After I disabled the 30-second signal for raising the exception, it still doesn't seem to be working. In Jupyter, it seems to never return control; however, the build does seem to load.
Below is the line:
env = UnityEnvironment(file_name=env_name)

IndexOutOfRangeException: Index was outside the bounds of the array.

While running the 3DBall example, I am getting an IndexOutOfRangeException for the Platform agent. Here's the snippet for the error:
(screenshot attached)

I thought changing the value of the Action Size parameter in the Inspector window to 8 might solve the issue, but that just gives me another error:
(screenshot attached)

Order of Academy Step and Agent Step

Hello,
I am a little confused about the order in which you call those functions internally. It looks like AcademyStep is called before AgentStep, and the rewards are collected after the agent step and before the next AcademyStep; am I right? In this case we have to know the reward of each agent right after its step, which might sometimes not be possible. Correct me if I am wrong.

Also, do you have plans to support training with the internal brain? I've done it somehow, but just want to know if you will provide a better API.

Thank you!

AttributeError: module 'signal' has no attribute 'SIGALRM'

I get the following error when running the Jupyter Notebook step.
This happens in the PPO and Basic versions.
I have Windows 10 and the latest Unity 2017.2.0b6 Personal.

The game opens up and the balls drop, but the tables do not move.
The brain is set to external.

I built the Unity environment and put the .exe here: ....\ml-agents-master\python

ppo.py
setup.py
3DBall.exe
3DBall_data

Here is what I updated in the Jupyter Notebook:

General parameters

max_steps = 10000 # Set maximum number of steps to run environment.
run_path = "ppo" # The sub-directory name for model and summary statistics
load_model = False # Whether to load a saved model.
train_model = True # Whether to train the model.
summary_freq = 10000 # Frequency at which to save training statistics.
save_freq = 50000 # Frequency at which to save model.
env_name = "3DBall" # Name of the training environment file.


ERROR MESSAGE

Load the environment
In [3]:

env = UnityEnvironment(file_name=env_name)
print(str(env))
brain_name = env.brain_names[0]

AttributeError Traceback (most recent call last)
in ()
----> 1 env = UnityEnvironment(file_name=env_name)
2 print(str(env))
3 brain_name = env.brain_names[0]

C:\Users\mpliszka\Documents\UnityProjects\ml-agents-master\python\unityagents\environment.py in __init__(self, file_name, worker_id, base_port)
86 str(file_name)))
87
---> 88 old_handler = signal.signal(signal.SIGALRM, timeout_handler)
89 signal.alarm(30) # trigger alarm in x seconds
90 try:

AttributeError: module 'signal' has no attribute 'SIGALRM'

Problem with new environment

I created a new environment as per the instructions given, created the build file, and tried to run the PPO file. But for some reason it is stuck on the line env.reset(train_mode=train_model). Specifically, it is stuck at this statement inside environment.py (line 172):
state = self._conn.recv(self._buffer_size).decode('utf-8')

The PPO file was changed to give some random actions at each step. I checked the same for 3D Ball and it is working. I am unable to debug this issue; please provide some help.

Should amend README not to add large files to git

I installed ML-AgentsWithPlugin.unitypackage and added it to my git repository, but the package contained a large file that exceeded 100 MB, and then I was no longer able to push.

We should amend the README to warn against adding such large files to git and to recommend using git LFS instead.

Unable to get tensorboard reporting to find event file on windows 10 - help request

I have been successful in executing the PPO notebook, including importing a bytes file into Unity and having it run both in the Editor and in a new standalone build.

I am successful in having the TensorBoard menu display at localhost:6060, but I am unable to get TensorBoard to find my event files and display results, on Windows 10 with Python 3.

// this command from the help file causes an error
tensorboard -- logdir DIRECTORY_PATH --debug
// I can see event files in ./summaries and I have tried using the fully qualified path; both ways localhost:6060 tells me that the files cannot be found.
tensorboard --logdir='./summaries'

documentation + use clarification

I'm trying to implement my own agent and environment and I'm a little unclear about a few things:

The implementations of CollectState() for Ball3DAgent and TennisAgent both seem to be roughly normalizing the floats they are returning. Is this a requirement, or just a best practice?

What's the range of values that can be expected from the float array act in AgentStep?

The MemorySize argument in the Brain: do you define it in the same way you do the State and Actions? If so, where? Is that only for heuristic brains?

Issue with setting env_name (syntax error)

Upon setting everything up, setting the env_name seems to have an issue in my build:

File "", line 11
We can reset the environment to be provided with an initial set of observations and states for all the agents within the environment. In ML-Agents, states refer to a vector of variables corresponding to relevant aspects of the environment for an agent. Likewise, observations refer to a set of relevant pixel-wise visuals for an agent.
^
SyntaxError: invalid syntax

The env_name is set as follows:

env_name = "3DBall"
train_mode = True

Also, there are no import errors and the exported .app is within the Python folder.

Thanks!

Need some help with 3DBall example

Setting up dependencies and building 3DBall example for the first time using readme ...

While running the PPO notebook "load the environment" step, there is a timeout. I have tried passing a different worker_id and port, but that doesn't help. Running all this on Windows 10.

Could someone figure out what went wrong? I pretty much followed the instructions in the readme and didn't get fancy with anything.

(screenshot attached)

PPO ValueError: need at least one array to concatenate

I'm trying to train my own agent in an environment and am getting this error; not sure what I'm misconfiguring in my scene for this to happen:

---------------------------------------------------------------------------
ValueError
Traceback (most recent call last)
in ()
42 if len(trainer.training_buffer['actions']) > buffer_size and train_model:
43 # Perform gradient descent with experience buffer
---> 44 trainer.update_model(batch_size, num_epoch)
45 if steps % summary_freq == 0 and steps != 0 and train_model:
46 # Write training statistics to tensorboard.

/Users/sterlingcrispin/code/Unity-ML/python/ppo/trainer.pyc in update_model(self, batch_size, num_epoch)
139 if self.is_continuous:
140 feed_dict[self.model.epsilon] = np.vstack(training_buffer['epsilons'][start:end])
--> 141 feed_dict[self.model.state_in] = np.vstack(training_buffer['states'][start:end])
142 else:
143 feed_dict[self.model.action_holder] = np.hstack(training_buffer['actions'][start:end])

/Users/sterlingcrispin/anaconda/lib/python2.7/site-packages/numpy/core/shape_base.pyc in vstack(tup)
235
236 """
--> 237 return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
238
239 def hstack(tup):

ValueError: need at least one array to concatenate

Unity environment freezing on launch

I've got to the point that I can successfully launch the 3D ball example. However, when I run the jupyter notebook file and the app starts automatically, it starts a tiny window and then freezes, and has to be process killed (see image below). Jupyter notebook doesn't show any errors, so I'm not entirely sure what's up. Any ideas as to how to fix this?

(screenshot attached)

Thank you!

Issues with Basic example

After training the Basic agent with PPO.ipynb and sending the bytes file back to Unity, the console shows this error:

UnityAgentsException: Expects arg[0] to be int32 but float is provided CoreBrainInternal.DecideAction () (at Assets/ML-Agents/Scripts/CoreBrainInternal.cs:240) Brain.DecideAction () (at Assets/ML-Agents/Scripts/Brain.cs:312) Academy.DecideAction () (at Assets/ML-Agents/Scripts/Academy.cs:250) Academy.RunMdp () (at Assets/ML-Agents/Scripts/Academy.cs:337) Academy.FixedUpdate () (at Assets/ML-Agents/Scripts/Academy.cs:260)

I have checked that the Action space type is Discrete; any ideas?


Trying to use BasicDecision.cs returns this error in the console:
NullReferenceException: Object reference not set to an instance of an object BasicAgent.AgentStep (System.Single[] act) (at Assets/ML-Agents/Examples/Basic/Scripts/BasicAgent.cs:25) Agent.Step () (at Assets/ML-Agents/Scripts/Agent.cs:209) Brain.Step () (at Assets/ML-Agents/Scripts/Brain.cs:324) Academy.RunMdp () (at Assets/ML-Agents/Scripts/Academy.cs:346) Academy.FixedUpdate () (at Assets/ML-Agents/Scripts/Academy.cs:260)

Easy fix: replace line 9 with this:
return new float[1]{ 1f };

Regards

Windows : No connection could be made because the target machine actively refused it.

On Windows 10: When trying to run the 3DBall environment in the Jupyter Notebook I get this exception (shown in the game application when using a debug-enabled .exe). The Jupyter notebook then times out because of this:


SocketException: No connection could be made because the target machine actively refused it.

  at System.Net.Sockets.Socket.Connect (System.Net.IPAddress[] addresses, System.Int32 port) [0x0011a] in <f044fe2c9e7e4b8e91984b41f0cf0b04>:0 
  at System.Net.Sockets.Socket.Connect (System.String host, System.Int32 port) [0x00007] in <f044fe2c9e7e4b8e91984b41f0cf0b04>:0 
  at ExternalCommunicator.InitializeCommunicator () [0x0004a] in C:\Users\XXXX\Git\ml-agents\unity-environment\Assets\ML-Agents\Scripts\ExternalCommunicator.cs:98 

I turned off the Windows firewall, so I'm unsure what could still cause this. Do I need to run something as Administrator?

TFException: Shape [-1,24] has negative dimensions

I was running a training session in the PPO notebook and stopped it, then exported the graph:

INFO:tensorflow:Restoring parameters from ./models/ppo/model-400000.cptk
Converted 4 variables to const ops.
20 ops in the final graph.

but when I use this .bytes file in Unity I get:

TFException: Shape [-1,24] has negative dimensions [[Node: epsilon = Placeholder[dtype=DT_FLOAT, shape=[?,24], _device="/job:localhost/replica:0/task:0/cpu:0"]()]] TensorFlow.TFStatus.CheckMaybeRaise (TensorFlow.TFStatus incomingStatus, System.Boolean last) (at <6ed6db22f8874deba74ffe3e566039be>:0) TensorFlow.TFSession.Run (TensorFlow.TFOutput[] inputs, TensorFlow.TFTensor[] inputValues, TensorFlow.TFOutput[] outputs, TensorFlow.TFOperation[] targetOpers, TensorFlow.TFBuffer runMetadata, TensorFlow.TFBuffer runOptions, TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0) TensorFlow.TFSession+Runner.Run (TensorFlow.TFStatus status) (at <6ed6db22f8874deba74ffe3e566039be>:0) CoreBrainInternal.DecideAction () (at Assets/ML-Agents/Scripts/CoreBrainInternal.cs:244) Brain.DecideAction () (at Assets/ML-Agents/Scripts/Brain.cs:308) Academy.DecideAction () (at Assets/ML-Agents/Scripts/Academy.cs:250) Academy.RunMdp () (a

Not sure why this has a -1 value. I've stopped training prematurely on the Ball3D demo without any problems; does anyone have any ideas?

Add Learning Rate Annealing to PPO

The current implementation of PPO uses a fixed learning rate for the duration of the training process. This can produce degenerate models later in training, when a smaller learning rate is necessary.

The learning rate should be annealed over time to 0.
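
For illustration, linear annealing to zero using the TensorFlow 1.x API this version of the trainer is built on could look like the sketch below; the initial rate and step count are placeholders, not the trainer's actual hyperparameters.

import tensorflow as tf

global_step = tf.Variable(0, trainable=False, name="global_step")

# Decays linearly (power=1.0) from 3e-4 down to 0 over decay_steps updates.
learning_rate = tf.train.polynomial_decay(
    learning_rate=3e-4,
    global_step=global_step,
    decay_steps=5000000,
    end_learning_rate=0.0,
    power=1.0)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)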

Python client freezes if exception is occured in game

Hi,
If an exception is raised in the C# code (at Agent.Step in my case), the Python client freezes in the env.step() function. It would be nice to catch the exception, send it to the Python client, and raise it there, or at least check for a timeout in the Python client and raise an exception.
Currently the timeout is disabled because of the self._conn.setblocking(1) line. If it's replaced by self._conn.settimeout(10), the timeout occurs as it should.
Tested on Python 3.6 / Jupyter / Windows 10 x64.
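
A self-contained sketch of the suggested change (the helper name is illustrative, not the actual environment.py code): receive with a timeout so a hung or crashed Unity process surfaces as a Python exception instead of blocking forever.

import socket

def recv_with_timeout(conn, buffer_size, timeout=10.0):
    conn.settimeout(timeout)  # instead of conn.setblocking(1)
    try:
        return conn.recv(buffer_size).decode("utf-8")
    except socket.timeout:
        raise RuntimeError(
            "No response from the Unity environment within %s seconds; "
            "it may have crashed or hit an exception." % timeout)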

AttributeError: module 'signal' has no attribute 'SIGALRM'

On Windows, signal() can only be called with SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV, or SIGTERM. A ValueError will be raised in any other case.

C:\Repos\UnityRF\ml-agents\python\unityagents\environment.py in __init__(self, file_name, worker_id, base_port)
     86                     str(file_name)))
     87 
---> 88         old_handler = signal.signal(signal.SIGALRM, timeout_handler)
     89         signal.alarm(30)  # trigger alarm in x seconds
     90         try:

AttributeError: module 'signal' has no attribute 'SIGALRM'
