I tried to log everything during training, especially the positions of obstacles my ag

Ghost obstacles hurt about osim-rl HOT 11 CLOSED

stanfordnmbl commented on May 18, 2024 1

Ghost obstacles hurt

from osim-rl.

Comments (11)

kidzik commented on May 18, 2024

Very interesting! Can you please check if env.env_desc['obstacles'] gives the same/similar obstacles? The exact description of the current environment should be there.


ctmakro commented on May 18, 2024

To reproduce, you could simply log every observation of every episode and check them with the algorithm above. My code was checking on the fly during training.


ctmakro commented on May 18, 2024

@kidzik I train mainly with a remote environment, so that wasn't possible when I wrote the previous post. I will try it on a local machine with a modified env to collect env_desc.

I think it might not be a problem with the obstacle list though, since the obstacle observations are generated by comparing pelvis_x with the obstacle list. It might be a bug that corrupts pelvis_x or the calculation of obstacle_relative.


kidzik commented on May 18, 2024

That's true, I meant it rather as a sanity check, also to check which of the obstacles is wrong (if it's always the first or the last one, it may speed up debugging).

One important thing here is the logic of the obstacles sensor. Current semantics are as follows (besides the error of the first observation):

if all obstacles are ahead, return the relative position of the first one, i.e. obstacle_x - pelvis_x
if pelvis_x has passed an obstacle, i.e. pelvis_x > obstacle_x + obstacle_radius, return the relative position of the next obstacle, i.e. obstacle_x - pelvis_x for that next obstacle
if there are no obstacles ahead, return [100,0,0]

as implemented here https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L110-L120
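
To make the three rules concrete, here is a minimal sketch of the semantics as described above. It is a paraphrase, not a copy of run.py: the function name `next_obstacle` and the `(x, y, radius)` tuple layout are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical layout for illustration: (x, y, radius)
Obstacle = Tuple[float, float, float]

def next_obstacle(pelvis_x: float, obstacles: List[Obstacle]) -> List[float]:
    """Return [relative_x, y, radius] of the first obstacle not yet passed.

    `obstacles` is assumed to be sorted by x. An obstacle only counts as
    passed once pelvis_x > x + radius.
    """
    for x, y, r in obstacles:
        if pelvis_x <= x + r:
            # Not passed yet: report it, even if its centre is already behind.
            return [x - pelvis_x, y, r]
    # Nothing left ahead: sentinel value from the rules above.
    return [100, 0, 0]

obstacles = [(1.0, 0.0, 0.1), (2.0, 0.0, 0.1)]
ahead = next_obstacle(0.0, obstacles)      # first obstacle, positive distance
straddling = next_obstacle(1.05, obstacles)  # centre passed, edge not yet
after_first = next_obstacle(1.2, obstacles)  # now reports the second obstacle
done = next_obstacle(5.0, obstacles)       # sentinel
```

The `straddling` case is where the relative distance goes negative: the pelvis is past the obstacle's centre but not yet past centre plus radius.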

Note that the second point makes it a little messy: the agent is likely to see an obstacle with a negative relative distance. Moreover, if one obstacle was covering another, the covered one may show up after the first one has been passed.

The actual list of obstacles is created here https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L228-L262 and I don't see any reason why it could be other than 3 since the code is straightforward. If it's a problem on osim-rl side, it should rather be in the sensing part.

Yet, the example you gave is still not covered by this case.


ctmakro commented on May 18, 2024

Thank you. I know the logic of generation of obstacles very well (we all do :) ).
There's one problem regarding the sensor though: you sorted the list of obstacles at the beginning, then assume they should be detected in that order by iterating thru the list. https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L256-L257
You sort them by obstacle_x; but when iterating thru the list, you compare the pelvis_x with obstacle_x+radius, which may not be in the same ascending order as obstacle_x.

This will result in undercounting in some cases (the client will never 'see' one of the obstacles throughout 1000 steps), this one for example:

This is not strictly a bug, since the unseen obstacle does not affect the agent's performance; it just makes the counting inaccurate.
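
A toy reproduction of the undercounting, under stated assumptions: the obstacle values below are made up, `first_unpassed` is a hypothetical stand-in for the sensor loop (sort by x, compare against x + radius), and the pelvis simply moves forward in fixed steps.

```python
def first_unpassed(pelvis_x, obstacles):
    """Index of the first obstacle (in list order) with pelvis_x <= x + radius."""
    for i, (x, r) in enumerate(obstacles):
        if pelvis_x <= x + r:
            return i
    return None

# Made-up (x, radius) pairs, sorted by x only.
# Index 0 ends at 1.0 + 0.5 = 1.5, index 1 ends at 1.2 + 0.1 = 1.3,
# so the x + radius order differs from the x order.
obstacles = sorted([(1.2, 0.1), (1.0, 0.5), (2.5, 0.1)])

seen = set()
for step in range(400):
    pelvis_x = step * 0.01  # walk from 0.0 to 3.99
    idx = first_unpassed(pelvis_x, obstacles)
    if idx is not None:
        seen.add(idx)

missed = set(range(len(obstacles))) - seen
```

Here obstacle 1 is never reported: by the time obstacle 0 counts as passed (pelvis_x > 1.5), obstacle 1 has already been passed too (pelvis_x > 1.3), so the loop skips straight to obstacle 2.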


kidzik commented on May 18, 2024

Yes, that's the reason I mentioned the exact procedure here. As you say, it should not affect performance much, while it might affect counting.
In either case, it still doesn't explain 4 obstacles...


ctmakro commented on May 18, 2024

on FourObstacleError I made a dump.

dump.zip
I'm currently examining it.


ctmakro commented on May 18, 2024

I'll paste all the problems I found.

psoas isn't constant:
psoas | psoas
-- | --
1.094567 | 0.796778
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.992841 | 0.991891
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951
0.940272 | 1.144951


ctmakro commented on May 18, 2024

i tried to plot, well this is strange:

It seems the data I collected from the environment contains two consecutive episodes rather than one. I guess there could be something seriously wrong in my parallelization code, so now I'll go check my own code and see if it is indeed my problem.
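
A quick way to spot this kind of mix-up in a log, sketched under the assumption that the psoas pair is supposed to stay fixed within one episode (which is why a change looked like a problem in the first place). The helper name `constant_runs` is made up; the values are the ones from the psoas table above.

```python
from itertools import groupby

def constant_runs(values):
    """Return (value, run_length) for each run of identical consecutive items."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(values)]

# The 34 rows of the psoas table, as (column 1, column 2) pairs.
log = (
    [(1.094567, 0.796778)]
    + [(0.992841, 0.991891)] * 11
    + [(0.940272, 1.144951)] * 22
)

runs = constant_runs(log)
lengths = [n for _, n in runs]
```

More than one long run in what should be a single episode is a hint that the log spans several episodes. The run of length 1 is the very first logged row.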


ctmakro commented on May 18, 2024

Update: the messed-up observations are likely my own problem, so this issue can be closed.
Still, the values in the first observation will always be wrong, due to osim-rl's implementation:
https://github.com/stanfordnmbl/osim-rl/blob/master/osim/env/run.py#L57-L63

    def reset(self, difficulty=2, seed=None):
        super(RunEnv, self).reset()
        self.istep = 0
        self.last_state = self.get_observation()
        self.setup(difficulty, seed)
        self.current_state = self.last_state
        return self.last_state

last_state is obtained before setup() is called, so it does not represent the actual model state after reset(). The two lines (the get_observation() call and the setup() call) should really be switched.
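
To illustrate the ordering problem without touching the real code, here is a toy stand-in. `ToyEnv` is NOT the osim-rl RunEnv: the single `strength` value and the way `setup()` randomizes it are assumptions made only to show why the order of the two calls matters.

```python
import random

class ToyEnv:
    """Hypothetical stand-in for an environment; not the osim-rl RunEnv."""

    def __init__(self):
        self.strength = 1.0

    def setup(self, seed=None):
        # Assumed behaviour: setup() changes state that shows up in observations.
        rng = random.Random(seed)
        self.strength = rng.gauss(1.0, 0.1)

    def get_observation(self):
        return [self.strength]

    def reset_buggy(self, seed=None):
        self.strength = 1.0            # base reset
        obs = self.get_observation()   # observed BEFORE setup()
        self.setup(seed)
        return obs

    def reset_fixed(self, seed=None):
        self.strength = 1.0            # base reset
        self.setup(seed)
        obs = self.get_observation()   # observed AFTER setup()
        return obs

env = ToyEnv()
stale = env.reset_buggy(seed=1)
stale_matches = stale == env.get_observation()

fresh = env.reset_fixed(seed=1)
fresh_matches = fresh == env.get_observation()
```

With the buggy order the returned observation no longer matches the state the episode actually starts from; with the two calls swapped it does.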


kidzik commented on May 18, 2024

Great! At this point, it's a duplicate of #53, so I'm closing this one.

