stanfordnmbl / osim-rl

Reinforcement learning environments with musculoskeletal models

Home Page: http://osim-rl.stanford.edu/

License: MIT License

Language: Python 100.0%
Topics: reinforcement-learning, kinematics, machine-learning, biomechanics, deep-reinforcement-learning, simulation-environment, opensim, simulator, physics

osim-rl's Introduction

NeurIPS 2019: Learn to Move - Walk Around

This repository contains the software required for participation in the NeurIPS 2019 Challenge: Learn to Move - Walk Around. See more details about the challenge here. See the full documentation of our reinforcement learning environment here. In this document we give the basic steps to get you set up for the challenge!

Your task is to develop a controller for a physiologically plausible 3D human model to walk or run following velocity commands with minimum effort. You are provided with a human musculoskeletal model and a physics-based simulation environment, OpenSim. There will be three tracks:

  1. Best performance
  2. Novel ML solution
  3. Novel biomechanical solution

The winners of each track will be awarded.

To model physics and biomechanics we use OpenSim - a biomechanical physics environment for musculoskeletal simulations.

What's new compared to NIPS 2017: Learning to run?

We took into account comments from the last challenge and made several changes:

  • You can use experimental data (to greatly speed up the learning process)
  • We released the third dimension (the model can fall sideways)
  • We added a prosthetic leg -- the goal is to solve a medical challenge: modeling how walking changes after getting a prosthesis. Your work can speed up the design, prototyping, and tuning of prosthetics!

You haven't heard of NIPS 2017: Learning to run? Watch this video!

HUMAN environment

Getting started

Anaconda is required to run our simulations. Anaconda creates a virtual environment with all the necessary libraries, to avoid conflicts with libraries in your operating system. You can get Anaconda from https://docs.anaconda.com/anaconda/install/. In the following instructions we assume that Anaconda is successfully installed.

For the challenge we prepared OpenSim binaries as a conda environment to make the installation straightforward.

We support Windows, Linux, and Mac OSX (all in 64-bit). To install our simulator, you first need to create a conda environment with the OpenSim package.

On Windows, open a command prompt and type:

conda create -n opensim-rl -c kidzik -c conda-forge opensim python=3.6.1
activate opensim-rl
pip install osim-rl

On Linux/OSX, run:

conda create -n opensim-rl -c kidzik -c conda-forge opensim python=3.6.1
source activate opensim-rl
pip install osim-rl

These commands will create a virtual environment on your computer with the necessary simulation libraries installed. If the command python -c "import opensim" runs smoothly, you are done! Otherwise, please refer to our FAQ section.

Note that source activate opensim-rl activates the anaconda virtual environment. You need to type it every time you open a new terminal.

Basic usage

To execute 200 iterations of the simulation, enter the python interpreter and run the following:

from osim.env import L2M2019Env

env = L2M2019Env(visualize=True)
observation = env.reset()
for i in range(200):
    observation, reward, done, info = env.step(env.action_space.sample())

Random walk

The function env.action_space.sample() returns a random vector for muscle activations, so, in this example, muscles are activated randomly (red indicates an active muscle and blue an inactive muscle). Clearly with this technique we won't go too far.

Your goal is to construct a controller, i.e. a function from the state space (current positions, velocities and accelerations of joints) to the action space (muscle excitations), that will enable the model to travel as far as possible in a fixed amount of time. Suppose you trained a neural network mapping observations (the current state of the model) to actions (muscle excitations), i.e. you have a function action = my_controller(observation); then

# ...
total_reward = 0.0
for i in range(200):
    # make a step given by the controller and record the state and the reward
    observation, reward, done, info = env.step(my_controller(observation))
    total_reward += reward
    if done:
        break

# Your reward is
print("Total reward %f" % total_reward)

You can find details about the observation object here.

Submission

In order to make a submission to AIcrowd, please refer to this page.

Rules

Organizers reserve the right to modify challenge rules as required.

Contributions of participants

Partners

osim-rl's People

Contributors

adamstelmaszczyk, amir-abdi, carmichaelong, chrisdembia, ctmakro, gautam1858, jenhicks, joychopra1298, kidzik, peterhj, seungjaeryanlee, shmuma, skbly7, smsong, spmohanty, vbotics


osim-rl's Issues

Example in the 'Basic Usage' section: step() gets slower as step count increases

Hello,

I installed everything and tried the example in the README.md:

from osim.env import GaitEnv

env = GaitEnv(visualize=True)
observation = env.reset()
for i in range(500):
    observation, reward, done, info = env.step(env.action_space.sample())

but after about 200 steps each step() call costs more than 1 second, and it keeps getting slower.

I added some simple logging and found that the time is consumed by:

self.osim_model.manager.integrate(self.osim_model.state)

in osim.py.

what might be the problem?

thanks,
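A minimal timing harness along these lines (a sketch, not part of the original report; GaitEnv as used above) makes it easy to confirm where the time goes and how the per-step cost grows:

import time
from osim.env import GaitEnv

env = GaitEnv(visualize=False)
observation = env.reset()
for i in range(500):
    t0 = time.time()
    observation, reward, done, info = env.step(env.action_space.sample())
    print("step %d took %.3f s" % (i, time.time() - t0))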

Can't submit because of 404 Client Error: NOT FOUND for url: http://grader.crowdai.org:1729/v1/envs/

adam@adam-ThinkPad-T520 ~/Desktop/running/stuff/osim-rl $ python scripts/submit.py --token XXX
Updating Model file from 30000 to latest format...
Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file /home/adam/Desktop/running/stuff/osim-rl/osim/env/../models/gait9dof18musc.osim
[2017-08-11 14:18:24,879] POST http://grader.crowdai.org:1729/v1/envs/
{"env_id": "Run", "token": "XXX", "version": "1.4.1"}
Traceback (most recent call last):
  File "scripts/submit.py", line 19, in <module>
    observation = client.env_create(args.token)
  File "/home/adam/Desktop/running/stuff/osim-rl/osim/http/client.py", line 57, in env_create
    resp = self._post_request(route, data)
  File "/home/adam/Desktop/running/stuff/osim-rl/osim/http/client.py", line 43, in _post_request
    return self._parse_server_error_or_raise_for_status(resp)
  File "/home/adam/Desktop/running/stuff/osim-rl/osim/http/client.py", line 34, in _parse_server_error_or_raise_for_status
    resp.raise_for_status()
  File "/home/adam/.local/lib/python2.7/site-packages/requests/models.py", line 935, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://grader.crowdai.org:1729/v1/envs/

Failed to start the visualizer inside the virtual environment

When I start the visualizer in the conda virtual environment, I got this:

Updating Model file from 30000 to latest format...
Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file /home/username/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/osim_rl-1.4.1-py2.7.egg/osim/env/../models/gait9dof18musc.osim
freeglut (simbody-visualizer): OpenGL GLX extension not supported by display ':0.0'

And it was stuck like forever.

The system is Ubuntu 14.04
with CUDA 8.0

cannot install opensim-core

Hi, I encounter the following problem when installing opensim-core. Can you please advise?

85% tests passed, 13 tests failed out of 85

Total Test time (real) = 941.27 sec

The following tests FAILED:
6 - testComponentInterface (Failed)
9 - testSTOFileAdapter (OTHER_FAULT)
13 - testMarkerData (OTHER_FAULT)
44 - exampleHopperDevice (Failed)
45 - exampleHopperDeviceAnswers (Failed)
49 - testControllerExample (Failed)
51 - testExampleMain (OTHER_FAULT)
53 - testOptimizationExample (OTHER_FAULT)
55 - testCustomActuatorExample (OTHER_FAULT)
57 - testMuscleExample (OTHER_FAULT)
83 - testCommandLineInterface (Not Run)
84 - python_tests (Failed)
85 - python_examples (Failed)
Errors while running CTest

During the installation there are 2 errors:

Copying _simulation library to python package in build directory.
[ 94%] Built target _simulation
make: *** [all] Error 2

clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [opensim-cmd] Error 1
make[1]: *** [Applications/opensim-cmd/CMakeFiles/opensim-cmd.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

Thanks

NOT FOUND for url: http://grader.crowdai.org:1729/v1/envs/

I am just trying to execute the submission example with a basic random function:

def my_controller(observation):
    return env.action_space.sample()

and I get this error:

[2017-07-05 15:36:29,198] POST http://grader.crowdai.org:1729/v1/envs/
{"env_id": "Run", "token": "f287d48cd4a19cbdf765dfcd4fa5fea2"}
Traceback (most recent call last):
  File "rand_sample.py", line 13, in <module>
    observation = client.env_create(crowdai_token)
  File "/home/alexis/osim-rl/osim/http/client.py", line 52, in env_create
    resp = self._post_request(route, data)
  File "/home/alexis/osim-rl/osim/http/client.py", line 40, in _post_request
    return self._parse_server_error_or_raise_for_status(resp)
  File "/home/alexis/osim-rl/osim/http/client.py", line 32, in _parse_server_error_or_raise_for_status
    resp.raise_for_status()
  File "/home/alexis/miniconda2/envs/opensim-rl/lib/python2.7/site-packages/requests/models.py", line 937, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: NOT FOUND for url: http://grader.crowdai.org:1729/v1/envs/

Is http://grader.crowdai.org:1729 the right address?

Example forward integration does not set activations correctly

Hi,

I just started looking into this project and installed everything. It looks very interesting to explore, but I am running into a problem that I can't solve.
When I try to run a forward simulation with 'random' activations (suggested in the documentation on crowdAI), the model does not perform as it should: all the muscles stay unactivated and the model just drops down. I noticed that my action space was empty (Box(0,)). I think this should not be happening, so probably I am doing something wrong. I could solve it by adding the line self.reset() in osim.py, just before action_space and observation_space are set in the __init__ method.
Now it seems to work for the first 3-4 integration steps. After that the model shows no activation and falls down.

Regards
Tom

Obstacle generation

Looking at the latest results on the leaderboard: once an agent passes the 3 quite regularly placed spheres, it can move further quite easily, not meeting any new obstacles. Maybe it makes sense to generate obstacles along the whole way forward, and less regularly than they are currently generated? Also, difficulty 2 should not have just 3 obstacles in total; rather, 3 obstacles on average should be generated per some distance, for example 10 meters.

The difficulty could also increase with distance; for example, the maximum possible size of the generated spheres could grow a bit every 10 meters. That point is optional, but generating obstacles along the whole road is quite important (see the sketch below).
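A hypothetical sketch of what distance-based generation could look like (illustrative only; none of these names come from the repository's code):

import random

def generate_obstacles(track_length=100.0, mean_per_10m=3, base_max_radius=0.05, growth=0.01):
    obstacles = []
    segments = int(track_length // 10)
    for segment in range(segments):
        radius_cap = base_max_radius + growth * segment  # difficulty grows with distance
        count = max(0, int(round(random.gauss(mean_per_10m, 1))))  # ~3 per 10 m on average
        for _ in range(count):
            x = segment * 10 + random.uniform(0.0, 10.0)
            obstacles.append((x, random.uniform(0.01, radius_cap)))
    return sorted(obstacles)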

Problem about reset difficulty

Hey, I met a simple problem when I ran the following code

e = RunEnv(visualize=False)
e.reset(difficulty=0)

and got this error

TypeError: reset() got an unexpected keyword argument 'difficulty'

However, I checked the source code in osim/env/run.py

def reset(self, difficulty=2, seed=None):
        super(RunEnv, self).reset()
        self.istep = 0
        self.last_state = self.get_observation()
        self.setup(difficulty, seed)
        self.current_state = self.last_state
        return self.last_state

So I made a slight change in the source code to see what happens.

def reset(self, difficulty=2, seed=None):
        print('breakpoint 1')
        super(RunEnv, self).reset()
        print('breakpoint 2')
        self.istep = 0

Nothing about the breakpoints is printed. It seems like reset() in OsimEnv is called instead of the one in RunEnv?
How can I fix it? Thanks.
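A quick way to check (a sketch; the likely cause is an older pip-installed copy of osim-rl shadowing the GitHub source you are reading):

import inspect
from osim.env import RunEnv

# If this path points into site-packages rather than your source checkout,
# an older installed copy (whose reset() lacks the 'difficulty' argument) is being used.
print(inspect.getsourcefile(RunEnv))
print(inspect.getargspec(RunEnv.reset))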

The memory the osim env takes increases linearly with the number of steps taken, even if the env gets reset several times in the middle

The memory the python process takes increases linearly with the number of steps taken by GaitEnv. Is it supposed to be like that?

from osim.env import GaitEnv

env = GaitEnv(visualize=False)
s = 0
while s < 50000:
    o = env.reset()
    d = False
    while not d:
        o, r, d, i = env.step(env.action_space.sample())
        s += 1

This small script takes more than 500MB on my computer (Python 2.7, OSX).

While training, the env takes more than 20GB of memory for about 1 million steps. Am I doing something wrong? There are cases where my training script stopped because of memory errors after training for a day on my little 8GB computer.

[Hacked] the memory might always be leaking, here's a solution

Aug 29 Edit: the interprocess communication implementation changed from Pipe() to Queue(). Queue() = Pipe() + Lock(), and therefore will not result in race/starvation errors.

Due to the internal implementation of RunEnv(), there are at least two situations known to cause memory/resource leaks, on Windows and reportedly on other platforms:

  1. Calling RunEnv() more than once in the same process (whether single- or multi-threaded).

    Multiple RunEnv() instances, when created in the same python process, will actually talk to the same OpenSim backend and run multiple simulations interleaved, so if you are looking for parallelism, this is not an option.

    Starting multiple env = RunEnv() one after another, running single-threaded, will still cause strange slowdowns, even after the garbage collector automatically destroys the env objects for you. The RunEnv()s started later will run more slowly than the earlier ones.

    Therefore RunEnv() is not clean: you should never call it more than once in a single process.

  2. env = RunEnv(), then env.step() for more than 100000 step()s, with or without reset().

    This is reported (and confirmed) in #10 to cause a severe memory leak. The solution is to completely destroy the environment before the leak becomes disastrous; typically it should be destroyed every 100000 or fewer step()s. But again, you cannot completely destroy the env class, as stated in point 1.

There is very little we can do to solve the root of the problem; however, by using the multiprocessing library in python we can easily overcome this limitation and make RunEnv() clean again.

You can create an environment wrapper class, have that class create a process, and have RunEnv() run only within that process. By destroying the class you also destroy the process, which by design is 100% clean.

Below is my solution.

import multiprocessing,time,random,threading
from multiprocessing import Process, Queue

# separate process that holds a separate RunEnv instance.
# This has to be done since RunEnv() instances in the same process result in interleaved simulations.
def standalone_headless_isolated(pq, cq):
    print('starting headless...',pq,cq)
    try:
        import traceback
        from osim.env import RunEnv
        e = RunEnv(visualize=False)
    except Exception as e:
        print('error on start of standalone')
        traceback.print_exc()
        return

    def floatify(arr):
        # convert a numpy vector to a plain list of floats for pickling over the queue
        return [float(x) for x in arr]

    try:
        while True:
            msg = pq.get()
            # messages should be tuples,
            # msg[0] should be string

            if msg[0] == 'reset':
                o = e.reset(difficulty=2)
                cq.put(floatify(o))
            elif msg[0] == 'step':
                ordi = e.step(msg[1])
                ordi[0] = floatify(ordi[0])
                cq.put(ordi)
            else:
                cq.close()
                pq.close()
                del e
                break
    except Exception:  # avoid shadowing the env variable 'e' above
        traceback.print_exc()

    return # end process

# class that manages the interprocess communication and expose itself as a RunEnv.
class ei: # Environment Instance
    def __init__(self):
        self.id = id(self)  # identifier used in pretty() log messages
        self.pretty('instance creating')
        self.newproc()

    # create a new RunEnv in a new process.
    def newproc(self):
        self.pq, self.cq = Queue(1), Queue(1) # two queue needed

        self.p = Process(
            target = standalone_headless_isolated,
            args=(self.pq, self.cq)
        )
        self.p.daemon = True
        self.p.start()
        return

    # send x to the process
    def send(self,x):
        return self.pq.put(x)

    # receive from the process.
    def recv(self):
        r = self.cq.get()
        return r

    def reset(self):
        self.send(('reset',))
        r = self.recv()
        return r

    def step(self,actions):
        self.send(('step',actions,))
        r = self.recv()
        return r

    def kill(self):
        self.send(('exit',))
        self.pretty('waiting for join()...')

        while 1:
            self.p.join(timeout=5)
            if not self.p.is_alive():
                break
            else:
                self.pretty('process is not joining after 5s, still waiting...')
        self.pretty('process joined.')

    def __del__(self):
        self.pretty('__del__')
        self.kill()
        self.pretty('__del__ accomplished.')

    # pretty printing
    def pretty(self,s):
        print(('(ei) {} ').format(self.id)+str(s))

You can run env = ei() to start an environment, and reset() and step() through it. You can del env every so often without any problems.

The good news is, you can instantiate more than one ei() and have them run in parallel. This way you can train faster, if your algorithm can exploit parallelism.
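A usage sketch for the wrapper above (assuming the ei class as defined): two isolated simulator processes stepping in parallel.

# two isolated simulator processes; each ei owns one RunEnv
envs = [ei() for _ in range(2)]
observations = [env.reset() for env in envs]

for _ in range(10):
    results = [env.step([0.5] * 18) for env in envs]  # 18 muscle excitations
    observations = [r[0] for r in results]

for env in envs:
    env.kill()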

How to generate a video?

We are trying the code on Ubuntu, and we managed to generate videos for gym games like CartPole and Atari by running 'xvfb-run' (because the display is based on Xvnc).
We used the Monitor wrapper to generate videos, but in 'osim.py':

metadata = {
     'render.modes': ['human'],
     'video.frames_per_second': 50
}

However, 'video_recorder' requires render.modes to include 'rgb_array' or 'ansi'. Does anyone know how to generate a video? Thanks

Reward calculation for RunEnv

Hi, I noticed that when you calculate the reward, you don't update last_state to current_state. The only place where last_state is updated is the reset method.
This means that you don't return the one-step reward, but the total reward up to the current episode step.
I think you should fix it 😄
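A sketch of the fix being suggested (attribute names follow the issue; the pelvis-x index is a hypothetical placeholder):

PELVIS_X = 1  # hypothetical index of the pelvis x coordinate in the state vector

# sketch of the suggested fix inside the environment's reward computation
def compute_reward(self):
    reward = self.current_state[PELVIS_X] - self.last_state[PELVIS_X]  # one-step progress
    self.last_state = self.current_state  # without this, the reward accumulates over the episode
    return reward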

"AttributeError: can't set attribute" in "Basic Usage" step

Hi,
Thanks for the detailed installation steps. I tried installing and running the environment on Windows 10 with Python 2.7. The installation was pretty smooth and I also got through the "import opensim" step without errors. However, when I run the 2nd line of Basic Usage, env = RunEnv(visualize=True), I get the AttributeError. The entire error log is:
Updating Model file from 30000 to latest format...
Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file C:\Users\arna\Anaconda2\envs\opensim-rl\lib\site-packages\osim\env../models/gait9dof18musc.osim
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\arna\Anaconda2\envs\opensim-rl\lib\site-packages\osim\env\run.py", line 29, in __init__
    super(RunEnv, self).__init__(visualize = False, noutput = self.noutput)
  File "C:\Users\arna\Anaconda2\envs\opensim-rl\lib\site-packages\osim\env\osim.py", line 137, in __init__
    self.spec = Spec()
AttributeError: can't set attribute

Thanks for the help.

Submission error: TypeError: Array is not JSON serializable

I am having trouble submitting. I am on Windows 10 and have followed the setup in the README (Python 2.7, opensim 4.0.0, osim-rl 1.4.1). I have modified the basic script into a minimal failing example:

import opensim as osim
from osim.http.client import Client
from osim.env import RunEnv

# Settings
remote_base = "http://grader.crowdai.org:1729"
crowdai_token = "..."  # something of the form: ec790...

client = Client(remote_base)

# Create environment
observation = client.env_create(crowdai_token)

# IMPLEMENTATION OF YOUR CONTROLLER
env = RunEnv(visualize=False)

while True:
    [observation, reward, done, info] = client.env_step(env.action_space.sample(), True)
    print(observation)
    if done:
        observation = client.env_reset()
        if not observation:
            break

client.submit()

I get the following output

[2017-08-08 11:58:49,009] POST http://grader.crowdai.org:1729/v1/envs/
{"env_id": "Run", "token": "ec790bd1915dc26cc3c164ff2a5c97fd", "version": "1.4.1"}
[2017-08-08 11:58:50,272] POST http://grader.crowdai.org:1729/v1/envs/ec790bd1915dc26cc3c164ff2a5c97fd___48b52647fc/monitor/start/
{"directory": "tmp", "video_callable": false, "force": true, "resume": false}
[2017-08-08 11:58:51,540] POST http://grader.crowdai.org:1729/v1/envs/ec790bd1915dc26cc3c164ff2a5c97fd___48b52647fc/reset/
null
Updating Model file from 30000 to latest format...
Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file C:\Anaconda\envs\opensim-rl\lib\site-packages\osim\env\../models/gait9dof18musc.osim
Traceback (most recent call last):
  File "env_submit.py", line 55, in <module>
    [observation, reward, done, info] = client.env_step(env.action_space.sample(), True)
  File "C:\Anaconda\envs\opensim-rl\lib\site-packages\osim\http\client.py", line 72, in env_step
    resp = self._post_request(route, data)
  File "C:\Anaconda\envs\opensim-rl\lib\site-packages\osim\http\client.py", line 39, in _post_request
    logger.info("POST {}\n{}".format(url, json.dumps(data)))
  File "C:\Anaconda\envs\opensim-rl\lib\json\__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "C:\Anaconda\envs\opensim-rl\lib\json\encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Anaconda\envs\opensim-rl\lib\json\encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "C:\Anaconda\envs\opensim-rl\lib\json\encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: array([ 0.5488135 ,  0.71518937,  0.60276338,  0.54488318,  0.4236548 ,
        0.64589411,  0.43758721,  0.891773  ,  0.96366276,  0.38344152,
        0.79172504,  0.52889492,  0.56804456,  0.92559664,  0.07103606,
        0.0871293 ,  0.0202184 ,  0.83261985]) is not JSON serializable
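The likely cause is that env.action_space.sample() returns a numpy array, and Python's json module cannot serialize numpy arrays. A minimal workaround (a sketch) is to convert the action to a plain list before sending it:

# convert the numpy array to a plain Python list so json.dumps can handle it
action = env.action_space.sample().tolist()
[observation, reward, done, info] = client.env_step(action, True)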

error when running example.py --visualize --train

When following directions in Training Your First Model and running python 2.7.13 (conda 4.3.22), I get the following error. Please let me know any further info that I can provide.

Training for 10000 steps ...
Traceback (most recent call last):
  File "example.py", line 86, in <module>
    agent.fit(env, nb_steps=nallsteps, visualize=False, verbose=1, nb_max_episode_steps=env.timestep_limit, log_interval=10000)
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/rl/core.py", line 70, in fit
    observation = deepcopy(env.reset())
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/env/run.py", line 53, in reset
    super(RunEnv, self).reset()
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/gym/core.py", line 104, in reset
    return self._reset()
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/env/osim.py", line 146, in _reset
    return self.get_observation()
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/env/run.py", line 133, in get_observation
    obstacle = self.next_obstacle()
  File "/Users/me/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/env/run.py", line 111, in next_obstacle
    return map(lambda xy: xy[0]-xy[1], [x for x in zip(obstacle, [xy[0],0,0])])
NameError: global name 'xy' is not defined
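The failing line references xy inside a list comprehension where it is not defined. A sketch of what the code appears to intend (reporting the obstacle position relative to the pelvis; pelvis_x is an assumed input here, not the repo's actual variable):

def next_obstacle_relative(obstacle, pelvis_x):
    # obstacle is [x, y, radius]; report x relative to the pelvis position
    return [obstacle[0] - pelvis_x, obstacle[1], obstacle[2]]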

env.reset() causing SimTK Exception thrown Error

I am trying to call env.reset() after 50 timesteps, but I get the following error:

std::exception in 'bool OpenSim::Manager::integrate(SimTK::State &)': SimTK Exception thrown at AbstractIntegratorRep.cpp:428

Can you please advise how to reset the environment properly?
Thanks

from osim.env import GaitEnv

env = GaitEnv(visualize=True)
observation = env.reset()
print observation

for i in range(10000):
    print i
    observation, reward, done, info = env.step(env.action_space.sample())
    if i == 50:
        env.reset()

submit.py || Connection error

Hi,

I am getting connection error while trying to execute submit.py.

Traceback (most recent call last):
  File "submit.py", line 19, in <module>
    observation = client.env_create(args.token)
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 52, in env_create
    resp = self._post_request(route, data)
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 39, in _post_request
    data=json.dumps(data))
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/requests/sessions.py", line 549, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/requests/sessions.py", line 502, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/requests/sessions.py", line 612, in send
    r = adapter.send(request, **kwargs)
  File "/Users/Ish/anaconda/envs/opensim-rl/lib/python2.7/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))

timestep limit for a single simulation

The timestep limit for a particular simulation should be 500 (https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/osim.py#L70).

But while analysing the training logs to see the variation in path lengths, I noticed there were quite a few simulations which ran for more than 500 steps.
The reason is that I use the done response as the feedback for ending a simulation, and would expect it to end a simulation after 500 steps.

Is this expected functionality, or has it sneaked in by mistake?
I think the is_done (https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L75) function should enforce the upper bound on the maximum steps, at least for the grader.
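Until that is enforced server-side, a client-side cap is easy to add (a sketch; env and my_controller as in the README examples):

MAX_STEPS = 500  # the documented per-simulation limit

observation = env.reset()
for step in range(MAX_STEPS):
    observation, reward, done, info = env.step(my_controller(observation))
    if done:
        break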

NameError: name 'RunEnv' is not defined

After following the instructions for the Windows 10 install, RunEnv can't be imported:

>>> from osim.env import RunEnv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name RunEnv

Importing GaitEnv works, but running with visualization doesn't:

RuntimeError: std::exception in 'SimTK::State & OpenSim::Model::initSystem()': SimTK Exception thrown at VisualizerProtocol.cpp:163:
  Error detected by Simbody method VisualizerProtocol::ctor(): Unable to spawn executable 'simbody-visualizer' from directories:
  c:\Users\Viktor\Anaconda3\envs\opensim-rl\
  C:\Users\Viktor\Anaconda3\envs\opensim-rl/Library/simbody/bin/Final system error was errno=2 (No such file or directory).

Ghost obstacles hurt

I tried to log everything during training, especially the positions of the obstacles my agent has seen.

Pseudocode:

have_seen = []
for observation in observations:
    obstacle_absolute = obstacle_relative + pelvis_x
    if obstacle_absolute not in have_seen:
        if any(obstacle_absolute < x for x in have_seen):
            raise NewObstacleCloserToOriginError
        have_seen.append(obstacle_absolute)
        if len(have_seen) > 3:
            raise TooManyObstaclesError
    else:
        pass  # this obstacle was already seen

my actual python implementation at: https://github.com/ctmakro/stanford-osrl/blob/master/observation_processor.py#L196-L215

problem description:

  1. The first observation (from reset()) does not contain the closest obstacle. Instead it reads [100,0,0], meaning no obstacle is ahead. I have already explicitly handled that case in my code. I'm not sure if that's fixed on master.

  2. Sometimes ghost obstacles come out of nowhere. They exist for only about one frame, then disappear. Not very often though, about once every 50-100 episodes. With my algorithm above, those situations result in either NewObstacleCloserToOriginError or TooManyObstaclesError.

  3. I believe there's no bug in my algorithm, because when the Errors are not raised, my agent can run through the environment and score >15 points. If this were a bug on my side, ghost obstacles would come up more often and kill my agent more often.

  4. When logging the Error I also log the current state of the obstacle buffer (the variable have_seen shown above). Here are three samples from my console:

[[1.361221778390644, -0.011292036961017608, 0.05461492299694686], [2.5194703733694803, -0.0017294513769069337, 0.27494896696570115], [3.2016361046625708, -0.012949957370803934, 0.06392466839510719], [4.840752657985693, -0.0036897576500632386, 0.14875511812424147]]
(@ step 179)What the fuck you just did! Why num of balls became greater than 3!!!
(agent) something wrong on step(). episode teminates now
Traceback (most recent call last):
  File "D:\GitHub\stanford-osrl\ddpg2.py", line 385, in play
    observation, reward, done, _info = env.step(action_out) # take long time
  File "D:\GitHub\stanford-osrl\multi.py", line 37, in step
    o = self.obg(oo)
  File "D:\GitHub\stanford-osrl\multi.py", line 21, in obg
    processed_observation, self.old_observation = go(plain_obs, self.old_observation, step=self.stepcount)
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 219, in generate_observation
    addball_if_new()
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 213, in addball_if_new
    raise Exception('ball number greater than 3.')
Exception: ball number greater than 3.
ball number greater than 3.

The absolute x positions of the obstacles (I called them balls in my code), as shown above, are [1.36, 2.51, 3.20, 4.84].

1.0187754964007407 [[1.8462540880564555, 0.008053394783150494, 0.052140473594560394]]
(@ step 28)Damn! new ball closer than existing balls.
(agent) something wrong on step(). episode teminates now
Traceback (most recent call last):
  File "D:\GitHub\stanford-osrl\ddpg2.py", line 385, in play
    observation, reward, done, _info = env.step(action_out) # take long time
  File "D:\GitHub\stanford-osrl\multi.py", line 37, in step
    o = self.obg(oo)
  File "D:\GitHub\stanford-osrl\multi.py", line 21, in obg
    processed_observation, self.old_observation = go(plain_obs, self.old_observation, step=self.stepcount)
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 219, in generate_observation
    addball_if_new()
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 203, in addball_if_new
    raise Exception('new ball closer than the old ones.')
Exception: new ball closer than the old ones.
new ball closer than the old ones.

The absolute x position of the incoming obstacle is 1.01, less than the one already in the buffer, which is 1.84. This should only happen if my agent fell backwards and hit an obstacle it hadn't seen before, yet it definitely should have seen this one, because it's the first obstacle.

ep 369 / 200000 times: 1 noise_level 0.05
1.8385624314724573 [[2.5194703733694803, -0.0017294513769069337, 0.27494896696570115], [3.2016361046625708, -0.012949957370803934, 0.06392466839510719], [4.840752657985693, -0.0036897576500632386, 0.14875511812424147]]
(@ step 175)Damn! new ball closer than existing balls.
(agent) something wrong on step(). episode teminates now
Traceback (most recent call last):
  File "D:\GitHub\stanford-osrl\ddpg2.py", line 385, in play
    observation, reward, done, _info = env.step(action_out) # take long time
  File "D:\GitHub\stanford-osrl\multi.py", line 37, in step
    o = self.obg(oo)
  File "D:\GitHub\stanford-osrl\multi.py", line 21, in obg
    processed_observation, self.old_observation = go(plain_obs, self.old_observation, step=self.stepcount)
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 219, in generate_observation
    addball_if_new()
  File "D:\GitHub\stanford-osrl\observation_processor.py", line 203, in addball_if_new
    raise Exception('new ball closer than the old ones.')
Exception: new ball closer than the old ones.
new ball closer than the old ones.

Same as above, except this time not only is the new obstacle (at 1.83) closer to the origin than the old ones (at [2.51, 3.20, 4.84]), but we also have four obstacles in total.

Errors on a vanilla Ubuntu 16.04 install

I ran into this on a vanilla Ubuntu 16.04 install after setting up everything through conda.

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/opensim/__init__.py", line 1, in <module>
    from simbody import *
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/opensim/simbody.py", line 28, in <module>
    _simbody = swig_import_helper()
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/opensim/simbody.py", line 24, in swig_import_helper
    _mod = imp.load_module('_simbody', fp, pathname, description)
ImportError: libquadmath.so.0: cannot open shared object file: No such file or directory

It can be fixed by:

sudo apt-get install libquadmath0

Pickling RunEnv

How do I serialize the state of the environment so that I can correctly restore the saved state later?
If I remove pelvis from the dict in __getstate__ and pickle the dict, will it work?

Obstacle observations

At the moment we get next obstacle position and radius as observations:
https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L110

I can see a few potential issues with this approach that can affect learning:

  1. When the agent doesn't walk too well and is learning how to surpass an obstacle, its body can swing and the pelvis X position can oscillate back and forth a bit above the sphere. The next-obstacle observation can then change a lot, jumping from the current sphere position to the next sphere position a few times, which is not good for training; in general it's not good to have such large jumps in observation values when the character's position barely changes.

  2. When the pelvis X coordinate becomes larger than the current obstacle position, the observations jump to the next obstacle and the agent "forgets" the obstacle it just passed; but one leg can still be behind this old obstacle, and missing the position and radius of that obstacle also doesn't help learning a good locomotion policy.

It could potentially be quite a large and breaking change, but maybe the observation received about obstacles can be updated? For example, send information about the 2 nearest obstacles?

Introducing energy penalty to the reward

Hi,

I've noticed that in the previous contest (Learning to walk) and in the current one, most of the solutions were jumping ones. But humans usually don't use jumps for locomotion. I'm sure experts in biomechanics can say a lot about why jumping isn't good for walking and running; I suspect one of the reasons is that jumping at the same speed spends much more energy than walking. So maybe it makes sense to introduce a new small biologically inspired penalty term to the reward, such that jumping can still be a viable solution, but walking/running the same distance receives a larger reward.
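For illustration, such a penalty could look like the following sketch (the coefficient and the squared-activation cost are illustrative assumptions, not the challenge's reward definition):

import numpy as np

EFFORT_COEFF = 0.001  # illustrative weight; would need tuning

def shaped_reward(velocity_reward, activations):
    effort = np.sum(np.square(activations))  # squared activations as a rough metabolic cost
    return velocity_reward - EFFORT_COEFF * effort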

Exception in the grader

[2017-06-22 02:49:16,055] Exception on /v1/envs/ [POST]
Traceback (most recent call last):
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/flask/app.py", line 1988, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/flask/app.py", line 1641, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/flask/app.py", line 1544, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/flask/app.py", line 1639, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/flask/app.py", line 1625, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "gym_http_server.py", line 324, in env_create
    instance_id = envs.create(env_id, token)
  File "gym_http_server.py", line 110, in create
    env = osim_envs[env_id](visualize=False)
  File "/home/ubuntu/SDF/osim-rl/osim/env/run.py", line 30, in __init__
    super(RunEnv, self).__init__(visualize = False, noutput = self.noutput)
  File "/home/ubuntu/SDF/osim-rl/osim/env/osim.py", line 114, in __init__
    self.osim_model = Osim(self.model_path, self.visualize)
  File "/home/ubuntu/SDF/osim-rl/osim/env/osim.py", line 36, in __init__
    self.brain.addActuator(self.muscleSet.get(j))
  File "/home/ubuntu/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/opensim/simulation.py", line 25876, in addActuator
    return _simulation.Controller_addActuator(self, actuator)
RuntimeError: std::exception in 'void OpenSim::Controller::addActuator(OpenSim::Actuator const &)': PropertyTable::updAbstractPropertyByIndex(): index 0 out of range (0 properties in table).

I have been seeing this a lot lately in the grader error logs.
Any clue about the source of the error?

Actions outside [0,1] cause instant changes to activations bypassing OpenSim muscle model

From the README:

action - a list of length 18 of continuous values in [0,1] corresponding to excitation of muscles.

Also in code, https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/osim.py#L157:

    def activate_muscles(self, action):
        if np.any(np.isnan(action)):
            raise ValueError("NaN passed in the activation vector. Values in [0,1] interval are required.")
        brain = opensim.PrescribedController.safeDownCast(self.osim_model.model.getControllerSet().get(0))
        functionSet = brain.get_ControlFunctions()

        for j in range(functionSet.getSize()):
            func = opensim.Constant.safeDownCast(functionSet.get(j))
            func.setValue( float(action[j]) )

"Values in [0,1] interval are required" - but this is not checked.

This alone wouldn't be an issue. The issue is that, further down the line, OpenSim seems to work with actions outside the [0,1] range.

One can easily test it: actions outside [0,1] sent from Python simply work.

And it seems to me they are not (desirably) truncated to [0,1]. How do I know?

By mistake, the code I used was outputting actions outside of [0,1], and it trained some reasonable-looking solutions (e.g. scoring > 4). I saved that model.

I added clipping of actions to [0,1]. Now if I test the saved model (which was trained without clipping), it just falls, with reward close to 0. So actions outside [0,1] were interpreted in some way by OpenSim, not clipped to [0,1].

Now competitors are split into a few groups:

  1. Ones that don't know about that possibility, follow README and stick to [0,1].
  2. Ones that know about that possibility, but decided to stick to [0,1].
  3. Ones that don't know about that possibility, by accident don't stick to [0,1].
  4. Ones that know about that possibility and decided to use it, don't stick to [0,1].

I don't know if using the action space beyond [0,1] is beneficial; it could be.

Related: Yongjin asked about clipping on the competition forum.
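Until the grader enforces the range, participants can clip defensively on their side (a sketch; env and my_controller as in the README examples):

import numpy as np

# keep actions in the documented [0, 1] range regardless of what the policy outputs
action = np.clip(my_controller(observation), 0.0, 1.0)
observation, reward, done, info = env.step(action)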

Pickling OsimEnv

Some libraries capable of parallel processing raise the following error when trying to pickle the OsimEnv: TypeError: can't pickle SwigPyObject objects.

I checked the code and found the cause:

state['pelvis']  <class 'opensim.simulation.PlanarJoint'>
<opensim.simulation.PlanarJoint; proxy of <Swig Object of type 'OpenSim::PlanarJoint *' at 0x7fefa7e07ae0> > 

What should I do to make state['pelvis'] pickleable, or to convert it to another valid data type?
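One common pattern (a sketch, not the repo's API) is to drop the SWIG proxies in __getstate__ and rebuild them after unpickling:

class PicklableEnvMixin(object):
    # Sketch: SWIG proxies such as state['pelvis'] cannot be pickled,
    # so strip them before pickling and re-create them afterwards.
    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop('pelvis', None)  # opensim PlanarJoint proxy; not picklable
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.pelvis = None  # must be re-created, e.g. by re-loading the .osim model file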

NaN breaks the integrator

When we get NaN as an action, i.e. when we set activations to NaN, we get an error from the integrator:

std::exception in 'bool OpenSim::Manager::integrate(SimTK::State &,double)': SimTK Exception thrown at AbstractIntegratorRep.cpp:428:
  Integrator step failed at time 0 apparently because:
SimTK Exception thrown at AbstractIntegratorRep.cpp:547:
  Error detected by Simbody method AbstractIntegrator::takeOneStep(): Unable to advance time past 0.
  (Required condition 't1 > t0' was not met.)

This makes sense, and there should be an error, but we could catch it at the python level and make the message much more meaningful.
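A python-level guard along those lines could look like this sketch (checked_step is a hypothetical helper, not part of the package):

import numpy as np

def checked_step(env, action):
    # fail fast with a readable message instead of a SimTK integrator error
    if np.any(np.isnan(action)):
        raise ValueError("NaN in action vector; muscle activations must lie in [0, 1].")
    return env.step(action)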

Simulation speed slower than real time

Hi, thanks for hosting this challenge! For us the simulation (without visualization) runs at about 30 steps per second of real time. With the control frequency of 0.01 seconds, this means the simulation runs 3.33 times slower than real time. This seems quite slow. Is this performance expected or is there something wrong with our setup?

"replaying" a sequence of actions in the same environment has different behaviour

So, for the generation of the GIFs, I have been collecting the actions submitted by the users and then replaying them by starting the env with the seed used by the grader.
But the behaviour is completely different, even if I run it on the exact same server in the same conda env.

To validate this, you might collect the action+observation pairs in a single simulation, then run the whole thing again using the same seed, and measure the difference in observations at each step.
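A sketch of that validation (RunEnv's reset(difficulty, seed) signature is taken from the source quoted earlier on this page):

import numpy as np
from osim.env import RunEnv

env = RunEnv(visualize=False)
observation = env.reset(seed=42)
actions, observations = [], []
for _ in range(100):
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
    actions.append(action)
    observations.append(observation)
    if done:
        break

# replay with the same seed and measure per-step drift
observation = env.reset(seed=42)
for t, action in enumerate(actions):
    observation, reward, done, info = env.step(action)
    drift = np.max(np.abs(np.array(observation) - np.array(observations[t])))
    print("step %d: max observation drift %.6f" % (t, drift))
    if done:
        break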

Error with execution of the 200 iterations of the simulation

Hi,

Tried the default installation of the python 2.7 version of opensim-rl and hit an error when starting the script from the Basic usage chapter (execution of the 200 iterations). During installation I followed the instructions exactly, except for the name of the conda environment. You can find a log below:

(opensim-rl27) apexteam@apexubuntu:~/vmakoviychuk/Deep_RL/osim-rl27$ python -c "import opensim"
(opensim-rl27) apexteam@apexubuntu:~/vmakoviychuk/Deep_RL/osim-rl27$ python
Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:09:15)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from osim.env import RunEnv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named osim.env
>>> import opensim
>>> from osim.env import RunEnv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named osim.env

osim.http.client.ServerError on submission

I'm using Linux and get this on submission. It fails on client.env_create('my key'):

Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file /home/anton/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/env/../models/gait9dof18musc.osim
[2017-07-14 10:04:42,128] POST http://grader.crowdai.org:1729/v1/envs/
{"env_id": "Run", "token": "my key"}
Traceback (most recent call last):
  File "submit.py", line 24, in <module>
    observation = client.env_create('my key')
  File "/home/anton/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 52, in env_create
    resp = self._post_request(route, data)
  File "/home/anton/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 40, in _post_request
    return self._parse_server_error_or_raise_for_status(resp)
  File "/home/anton/anaconda3/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 31, in _parse_server_error_or_raise_for_status
    raise ServerError(message=j["message"], status_code=resp.status_code)
osim.http.client.ServerError

Please help to resolve this issue.

Duplicated DDPG agent scripts

Hi Lukasz,

Currently there are 2 DDPG training scripts: osim-rl/osim/sandbox/train.ddpg.py and osim-rl/scripts/keras-rl/train.ddpg.py.

Is this intended? The osim-rl/scripts/keras-rl/train.ddpg.py one is newer, but it doesn't contain the option to choose the standing environment, and the human env also isn't imported.

The videos SUCK, and here's why

Edit: don't read this, scroll the page down

The osim-rl-grader is forked from gym, therefore I cannot file an issue on it, so I'll just file it here.

https://github.com/kidzik/osim-rl-grader/blob/master/worker_dir/simulate.py#L32-L34

according to the code, you store each submission's action history in Redis, then generate MP4s by reading from Redis and simulating with the same environment again.

except that you can't: by using the same seed you generate the same environment, but you cannot guarantee the same dynamics given the same actions. Floating-point arithmetic is not exact, and due to precision limitations the error accumulates frame by frame, causing inaccuracy later in the re-simulation.

(since RunEnv is highly nonlinear with a high FPS, this problem becomes more apparent.)

so the videos on the leaderboard apparently score much less than the participant's score suggests. (In case you wonder: it's NOT due to the participant's poor performance in the first run.)

the correct way to do this: generate PNGs on the fly during submission, and convert them to MP4s later. If doing so causes too much overhead, consider logging the state of the armature and replaying that state for later image generation, instead of replaying the actions.

The videos would look much better that way, making this competition exceed DeepMind's YouTube video on walking agents in popularity, which is apparently what this whole thing is for.

Make 3.5+ the main supported Python version for the challenge

Hi,

I'd like to suggest making Python 3.5.2, not 2.7, the main version used for the challenge. Python 2.7 is a quite old and obsolete version; all the main deep learning frameworks have supported 3.5 and even 3.6+ for quite a long time. Moreover, some standard RL libraries, like rllab (https://github.com/openai/rllab) or the new baselines, officially support only Python 3.5+: https://github.com/openai/baselines/blob/master/setup.py

So I suppose there is no serious reason to make Python 2.7 the main supported version for a challenge directed toward the future. Moreover, while Linux users have some choice, making 2.7 the main supported version blocks almost any chance of running osim-rl on Windows, and instructions for Windows installation won't make a lot of sense, as on Windows 10 TensorFlow supports only Python 3.5.2 (support for 3.6 will probably be added soon). As a result, Keras-RL with the TensorFlow backend won't work at all.

Maybe it's not only me who would find it more convenient to develop and test already-trained agents on Windows, but train on Linux PCs.

Best regards,
Viktor

P.S. Even on my Ubuntu PC, all the DL and RL libraries that I plan to use were already installed and set up in a Python 3.5+ environment. It would be much more convenient if osim-rl could be used in an already set-up and tested environment too.

Can't submit because of 405 from grader.crowdai.org:1729/v1/envs/

With HEAD at 8572f34:

$ python submit.py --token XXX
Updating Model file from 30000 to latest format...
Loaded model gait9dof18musc_Thelen_BigSpheres.osim from file /home/adam/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/osim/env/../models/gait9dof18musc.osim
[2017-07-13 20:09:06,334] POST http://grader.crowdai.org:1729/v1/envs/
{"env_id": "Run", "token": "XXX"}
Traceback (most recent call last):
  File "submit.py", line 16, in <module>
    observation = client.env_create(args.token)
  File "/home/adam/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 52, in env_create
    resp = self._post_request(route, data)
  File "/home/adam/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 40, in _post_request
    return self._parse_server_error_or_raise_for_status(resp)
  File "/home/adam/anaconda2/envs/opensim-rl/lib/python2.7/site-packages/osim/http/client.py", line 31, in _parse_server_error_or_raise_for_status
    raise ServerError(message=j["message"], status_code=resp.status_code)
osim.http.client.ServerError
$ curl http://grader.crowdai.org:1729/v1/envs/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method is not allowed for the requested URL.</p>
