derwenai / gym_example
An example implementation of an OpenAI Gym environment used for a Ray RLlib tutorial
Home Page: https://anyscale.com/academy
License: MIT License
I'm getting an error when running train.py.
Machine: MacBook Pro 2017, macOS Big Sur 11.0.1 (20B29)
Environment: Anaconda
conda 4.9.2
Python 3.8.5
Ray 1.0.1.post1
TensorFlow 2.3.1
Full console log:
WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-12-08 22:31:33,163 INFO services.py:1090 -- View the Ray dashboard at http://127.0.0.1:8265
2020-12-08 22:31:36,024 INFO logger.py:200 -- pip install 'ray[tune]' to see TensorBoard files.
2020-12-08 22:31:36,024 WARNING logger.py:342 -- Could not instantiate TBXLogger: No module named 'tensorboardX'.
2020-12-08 22:31:36,026 INFO trainer.py:592 -- Tip: set framework=tfe or the --eager flag to enable TensorFlow eager execution
2020-12-08 22:31:36,026 INFO trainer.py:1064 -- `_use_trajectory_view_api` only supported for PyTorch so far! Will run w/o.
2020-12-08 22:31:36,026 INFO trainer.py:617 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(pid=20565) WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
(pid=20565) Instructions for updating:
(pid=20565) non-resource variables are not supported in the long term
(pid=20564) WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
(pid=20564) Instructions for updating:
(pid=20564) non-resource variables are not supported in the long term
2020-12-08 22:31:48,167 INFO trainable.py:252 -- Trainable.setup took 12.142 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
2020-12-08 22:31:48,167 WARNING util.py:40 -- Install gputil for GPU system monitoring.
WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/policy/tf_policy.py:875: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Prefer Variable.assign which has equivalent behavior in 2.X.
(pid=20565) WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/policy/tf_policy.py:875: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
(pid=20565) Instructions for updating:
(pid=20565) Prefer Variable.assign which has equivalent behavior in 2.X.
(pid=20564) WARNING:tensorflow:From /opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/policy/tf_policy.py:875: Variable.load (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
(pid=20564) Instructions for updating:
(pid=20564) Prefer Variable.assign which has equivalent behavior in 2.X.
1 reward -21.00/ -6.90/ 10.00 len 7.94 saved tmp/exa/checkpoint_1/checkpoint-1
2 reward -20.00/ 0.87/ 10.00 len 5.64 saved tmp/exa/checkpoint_2/checkpoint-2
3 reward -19.00/ 5.68/ 10.00 len 3.96 saved tmp/exa/checkpoint_3/checkpoint-3
4 reward -18.00/ 7.16/ 10.00 len 3.32 saved tmp/exa/checkpoint_4/checkpoint-4
5 reward -16.00/ 7.66/ 10.00 len 3.02 saved tmp/exa/checkpoint_5/checkpoint-5
Model: "functional_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
observations (InputLayer) [(None, 11)] 0
__________________________________________________________________________________________________
fc_1 (Dense) (None, 256) 3072 observations[0][0]
__________________________________________________________________________________________________
fc_value_1 (Dense) (None, 256) 3072 observations[0][0]
__________________________________________________________________________________________________
fc_2 (Dense) (None, 256) 65792 fc_1[0][0]
__________________________________________________________________________________________________
fc_value_2 (Dense) (None, 256) 65792 fc_value_1[0][0]
__________________________________________________________________________________________________
fc_out (Dense) (None, 2) 514 fc_2[0][0]
__________________________________________________________________________________________________
value_out (Dense) (None, 1) 257 fc_value_2[0][0]
==================================================================================================
Total params: 138,499
Trainable params: 138,499
Non-trainable params: 0
__________________________________________________________________________________________________
None
2020-12-08 22:32:08,653 INFO trainable.py:481 -- Restored on 192.168.0.3 from checkpoint: tmp/exa/checkpoint_5/checkpoint-5
2020-12-08 22:32:08,653 INFO trainable.py:489 -- Current state after restoring: {'_iteration': 5, '_timesteps_total': None, '_time_total': 20.26572561264038, '_episodes_total': 4752}
Traceback (most recent call last):
File "train.py", line 83, in <module>
main()
File "train.py", line 69, in main
action = agent.compute_action(state)
File "/opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 819, in compute_action
preprocessed = self.workers.local_worker().preprocessors[
File "/opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/models/preprocessors.py", line 166, in transform
self.check_shape(observation)
File "/opt/anaconda3/envs/rl/lib/python3.8/site-packages/ray/rllib/models/preprocessors.py", line 62, in check_shape
raise ValueError(
ValueError: ('Observation ({}) outside given space ({})!', 9, Box(-1.0, 1.0, (11,), float32))
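To make the error message concrete: it reports that the value 9 was checked against Box(-1.0, 1.0, (11,), float32) and rejected. Here is a minimal stdlib-only sketch of that kind of containment check. `box_contains` and `one_hot` are hypothetical helpers, not gym or RLlib functions, and the idea that the policy expects a one-hot version of the integer state is my assumption, based only on the 11-wide input layer in the model summary.

```python
# Stdlib-only stand-in for the containment check a Box space performs.
# box_contains and one_hot are hypothetical helpers, not gym or RLlib APIs.

def box_contains(obs, low, high, shape):
    """Return True if obs is a flat sequence of the given 1-D shape within [low, high]."""
    if not isinstance(obs, (list, tuple)):
        return False  # a bare scalar such as 9 has no shape at all
    if (len(obs),) != shape:
        return False
    return all(low <= x <= high for x in obs)


def one_hot(index, size):
    """Encode an integer index as a one-hot list of floats."""
    return [1.0 if i == index else 0.0 for i in range(size)]


raw_obs = 9  # the value reported in the ValueError above

# A scalar integer does not fit an 11-element box.
print(box_contains(raw_obs, -1.0, 1.0, (11,)))               # False

# An 11-element one-hot vector (assumed encoding) does fit.
print(box_contains(one_hot(raw_obs, 11), -1.0, 1.0, (11,)))  # True
```

So the raw state and the space it is being checked against do not match in shape, which is what the traceback is complaining about. This sketch does not say which side is wrong in Ray 1.0.1.post1.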
But how can we pass arguments to the environment? I am struggling with that. I can do it with EnvContext during the training phase, but how do I do it for the prediction phase, where you used gym.make?
Any tips?
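One possible approach, sketched under assumptions: let the environment constructor take a single config dict, keep that dict in one place, and build the environment directly for prediction instead of going through gym.make. `ExampleEnv`, `ENV_CONFIG`, `max_steps` and `goal` are hypothetical names for illustration and do not come from this repo. In a real project the class would subclass gym.Env. It is left as a plain class here so the sketch runs without any dependencies.

```python
# Hypothetical environment; in a real project it would subclass gym.Env.
class ExampleEnv:
    def __init__(self, env_config=None):
        # During training RLlib passes an EnvContext here, which is a dict
        # subclass, so a plain dict works the same way when building by hand.
        env_config = env_config or {}
        self.max_steps = env_config.get("max_steps", 100)  # hypothetical option
        self.goal = env_config.get("goal", 0)              # hypothetical option


# Shared settings, defined once and reused for training and prediction.
ENV_CONFIG = {"max_steps": 50, "goal": 3}

# Training: the same dict would go into the trainer config, e.g.
#   config["env_config"] = ENV_CONFIG
# Prediction: construct the class directly rather than calling gym.make.
env = ExampleEnv(ENV_CONFIG)
print(env.max_steps, env.goal)  # 50 3
```

If you would rather keep gym.make, the gym releases I am aware of forward extra keyword arguments to the constructor, so something like gym.make(env_id, env_config=ENV_CONFIG) may also work with a constructor shaped like the one above. I have not checked that against the gym version used in this tutorial.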