🐛 Bug Hello. I am trying to reproduce some algorithms or experime


[Bug]: Nan Problems for SAC, TQC, for AntBulletEnv-v0, HalfCheetahBulletEnv-v0 (rl-baselines3-zoo, open, 9 comments)

ZJEast commented on June 5, 2024

[Bug]: Nan Problems for SAC, TQC, for AntBulletEnv-v0, HalfCheetahBulletEnv-v0

from rl-baselines3-zoo.

Comments (9)

qgallouedec commented on June 5, 2024

For TD3, I only found two runs where you have an explosion of the losses, but this didn't lead to the bug:
https://wandb.ai/openrlbenchmark/sb3/runs/2qdjqemd (Walker2DBulletEnv-v0)
https://wandb.ai/openrlbenchmark/sb3/runs/ffc7kx3m (BipedalWalkerHardcore-v0)
What a wonderful tool openrlbenchmark is, ping @vwxyzjn ;)

qgallouedec commented on June 5, 2024

This may be due to a learning rate that is too high, see #156 (comment). Do you use the default hyperparams?

Also related (and probably duplicate): DLR-RM/stable-baselines3#1401 and DLR-RM/stable-baselines3#1418

ZJEast commented on June 5, 2024

Yes, I use the default hyperparams. I will try different learning rates later.

araffin commented on June 5, 2024

Hello,
thanks for sharing the bug report.
Does the NaN happen only for some runs or for all runs?
Could you log and share a failed run using W&B? (that would allow us to take a look at all the logged data)

I also assume you are using the pybullet gymnasium repo?

I'll try to reproduce the issue in the meantime.

Also related: DLR-RM/stable-baselines3#1372. Changing to AdamW might solve the problem too.

ZJEast commented on June 5, 2024

I have tried TD3, SAC, and TQC on some pybullet envs. The NaN only happens for the tasks I mentioned; the others are fine.
I installed the pybullet envs with 'pip install -r ./requirements.txt'.

I can upload some log files.

sac-AntBulletEnv-v0.zip
sac-HalfCheetahBulletEnv-v0.zip
tqc-AntBulletEnv-v0.zip
tqc-HalfCheetahBulletEnv-v0.zip

araffin commented on June 5, 2024

Thanks =)

Looking at the log, it seems to be due to an explosion of the std (and you are using a much larger budget than the one we were using by default).
So, setting use_expln=True (and maybe using AdamW) should solve your issue.

I would appreciate a PR that adds this parameter =)

Hmm, for TD3 it is weird if it happens as it doesn't rely on any distribution.

EDIT: I guess the issue is similar to Stable-Baselines-Team/stable-baselines3-contrib#146 by @qgallouedec
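
To illustrate why `use_expln=True` is suggested for an exploding std, here is a minimal sketch of the expln idea as I understand it: below zero it matches `exp`, above zero it grows only logarithmically. The function name `expln` and the exact form `log1p(x) + 1` are my reading of how SB3 handles this for gSDE, simplified (no epsilon term), so treat it as an assumption rather than the library's code.

```python
import math


def expln(log_std: float) -> float:
    """Illustrative expln: exp(x) for x <= 0, log1p(x) + 1 for x > 0.

    Simplified sketch; not copied from the SB3 implementation.
    """
    if log_std <= 0.0:
        return math.exp(log_std)
    # Continuous at 0 (both branches give 1.0), but grows far more slowly.
    return math.log1p(log_std) + 1.0


if __name__ == "__main__":
    # Compare how fast each mapping grows as log_std drifts upward.
    for log_std in (-3.0, 0.0, 2.0, 10.0):
        print(
            f"log_std={log_std:5.1f}  "
            f"exp={math.exp(log_std):12.4f}  "
            f"expln={expln(log_std):8.4f}"
        )
```

With a plain `exp`, a log std that drifts upward over a long training budget turns into a very large std. The bounded growth above zero is what should keep it from blowing up.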

qgallouedec commented on June 5, 2024

Bug already encountered in openrlbenchmark, ~~I might have forgotten to report it~~: https://wandb.ai/openrlbenchmark/sb3/runs/27cez5ua
EDIT: I did report it, you're right @araffin ;)

ZJEast commented on June 5, 2024

After I changed the hyperparams from

policy_kwargs: "dict(log_std_init=-3, net_arch=[400, 300])"

to

policy_kwargs: "dict(log_std_init=-3, net_arch=[400, 300], use_expln=True)"

the problem never happened again, so let's close this issue.
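
For anyone applying the same change, a quick way to check what actually differs between the two `policy_kwargs` strings is to evaluate both and diff the resulting dicts. This sketch assumes the quoted strings are plain Python `dict(...)` expressions that get evaluated when the config is loaded; `parse_kwargs` is a hypothetical helper for illustration, not a function from the zoo.

```python
# The two policy_kwargs strings quoted in this thread.
BEFORE = "dict(log_std_init=-3, net_arch=[400, 300])"
AFTER = "dict(log_std_init=-3, net_arch=[400, 300], use_expln=True)"


def parse_kwargs(expr: str) -> dict:
    """Hypothetical helper: evaluate a dict(...) string with only `dict` in scope."""
    return eval(expr, {"__builtins__": {}, "dict": dict})


old_kwargs = parse_kwargs(BEFORE)
new_kwargs = parse_kwargs(AFTER)

# Keep only the keys whose value is new or different after the change.
changed = {k: v for k, v in new_kwargs.items() if old_kwargs.get(k) != v}

if __name__ == "__main__":
    print(changed)
```

The only difference is the added `use_expln` flag; `log_std_init` and `net_arch` are untouched.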

araffin commented on June 5, 2024

Thanks for trying out =)
I'm reopening as we need to change the defaults (we would welcome a PR).
