x35f / unstable_baselines

119.0 4.0 12.0 15.26 MB

Re-implementations of SOTA RL algorithms.

Python 99.19% Ruby 0.77% Shell 0.04%

baselines meta-rl model-based-rl pytorch reinforcement-learning

unstable_baselines's People

Contributors

mimeku shenbiachao typoverflow pilgrimygy lamda-rl ltsure stepneverstop rl-code-lib ruifeng-chen qzj-debug pickxiguapi zxq-0058

unstable_baselines's Issues

mbpo train.py model_rollout_steps

Shouldn't line 108 also set model_rollout_steps = new_model_rollout_steps?
If it should, does MBPO's performance improve or degrade once the assignment is added?
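To make the question concrete, here is a minimal sketch of a rollout-length schedule of the kind MBPO uses, with the assignment the issue asks about. It is not the repository's train.py: the function names, the schedule keys, and the example values are all hypothetical, and the buffer resize is only recorded, not performed.

```python
def scheduled_rollout_steps(epoch, min_epoch, max_epoch, min_steps, max_steps):
    """Linearly interpolate the rollout length between min_steps and max_steps."""
    if max_epoch <= min_epoch:
        return int(max_steps)
    frac = (epoch - min_epoch) / (max_epoch - min_epoch)
    steps = min_steps + frac * (max_steps - min_steps)
    # Clamp to [min_steps, max_steps] before truncating to an integer.
    return int(min(max(steps, min_steps), max_steps))


def run_schedule(num_epochs, **schedule):
    """Return the final rollout length and the epochs at which it changed."""
    model_rollout_steps = schedule["min_steps"]
    change_epochs = []
    for epoch in range(num_epochs):
        new_model_rollout_steps = scheduled_rollout_steps(epoch, **schedule)
        if new_model_rollout_steps != model_rollout_steps:
            # Stand-in for whatever work a length change triggers
            # (for example, resizing the model buffer).
            change_epochs.append(epoch)
            # The assignment the issue asks about: remember the new value.
            model_rollout_steps = new_model_rollout_steps
    return model_rollout_steps, change_epochs


# Hypothetical schedule values, chosen only for illustration.
final_steps, change_epochs = run_schedule(
    120, min_epoch=20, max_epoch=100, min_steps=1, max_steps=5
)
```

In this sketch, dropping the assignment would leave model_rollout_steps at its initial value, so the comparison would stay true on every epoch after the first change. Whether that affects MBPO's performance in the repository is not something the sketch can answer.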

Errors in running the library in Win11+Anaconda

I use Anaconda on Windows 11 to run the code. I followed the instructions, but an error occurs when I run the command.

Command:

cd unstable_baselines/baselines/sac
python3 main.py configs/Ant-v3.py --gpu 0

I also tried to run DQN with the SpaceInvader environment, and the error is:

It seems that line 41 of util doesn't fit. What should I do to fix the error? Or should I run the library on Linux?

OpenAi gym integration

I'd like to do the following but instead of SB3 I'd like to plug in unstable baselines. Is there a quick start guide or documentation somewhere that could help me get started?

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

# create the environment and wrap it in a vectorized environment
env = gym.make('MyEnv')
env = DummyVecEnv([lambda: env])

# create the PPO agent and train it on the environment
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

# test the trained agent
obs = env.reset()
for i in range(100):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()
    if dones:
        break
env.close()

Are all the tasks v3?

I see that the current performance curves are based on v3 tasks. However, the config files for model-based RL (mbpo) contain many v2 tasks. Are the v2 tasks in mbpo the same as the v3 tasks?

[Bug]: forget to pass kwargs to gym.make

unstable_baselines/unstable_baselines/common/env_wrapper.py

Line 50 in b3648f5

env = gym.make(env_name)

should be

env = gym.make(env_name, **kwargs)

for consistency.
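The proposed change amounts to forwarding keyword arguments to the environment factory. A minimal sketch of that pattern follows. The function name get_env and the make_fn parameter are hypothetical and are not taken from env_wrapper.py. make_fn exists only so the forwarding can be checked without gym installed.

```python
def get_env(env_name, make_fn=None, **kwargs):
    """Create an environment, forwarding extra keyword arguments to the factory."""
    if make_fn is None:
        # Imported lazily so the sketch can be exercised without gym installed.
        import gym
        make_fn = gym.make
    # Without **kwargs here, options given by the caller would be silently dropped.
    return make_fn(env_name, **kwargs)


def fake_make(name, **kw):
    """Stand-in factory that echoes what it received."""
    return name, kw


created = get_env("Ant-v3", make_fn=fake_make, max_episode_steps=500)
```

With the stand-in factory, created holds the environment name together with the forwarded keyword arguments, which shows that the caller's options reach the factory.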

Installation error on env creation

git clone --recurse-submodules https://github.com/x35f/unstable_baselines.git
cd unstable_baselines
conda env create -f env.yaml 
conda activate rl_base
pip install -e .

cd unstable_baselines
PS C:\dev\unstable_baselines> conda env create -f env.yaml
Retrieving notices: ...working... done
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - mujoco=2.1.2

PS C:\dev\unstable_baselines> conda activate rl_base

EnvironmentNameNotFound: Could not find conda environment: rl_base
You can list all discoverable environments with `conda info --envs`.

`openrlbenchmark` integration

Hi all, this is very cool stuff. I especially like that there is an MBPO implementation.

Would you be interested in using wandb to contribute experiment runs to openrlbenchmark utilities? It provides more plots and analysis options and could make comparing with other libraries much easier. See openrlbenchmark/openrlbenchmark#22 for an example.

To use openrlbenchmark, all you need to do is to track metrics in wandb with the x-axis being global_step. If you use tensorboard, you just need to turn on the wandb tensorboard integration (example). Then you can use openrlbenchmark to pull runs from wandb and generate plots like in openrlbenchmark/openrlbenchmark#22
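As a rough illustration of the two options described above, the sketch below starts a wandb run with TensorBoard syncing enabled and builds a metrics payload that carries global_step. The helper names, the project name, and the metric key are hypothetical. Only wandb.init with sync_tensorboard=True and run.log are real wandb calls, and both need the wandb package and a logged-in account.

```python
def start_run(project, run_name):
    """Start a wandb run that mirrors TensorBoard summaries."""
    import wandb  # imported lazily; requires the wandb package
    return wandb.init(project=project, name=run_name, sync_tensorboard=True)


def with_global_step(metrics, global_step):
    """Return a copy of metrics that also records global_step."""
    payload = dict(metrics)
    payload["global_step"] = global_step
    return payload


# Pure-Python part of the sketch; the metric name is made up.
payload = with_global_step({"charts/episodic_return": 123.0}, global_step=1000)

# With a live run, one would log it like this (not executed here):
#   run = start_run("my-project", "sac-Ant-v3")
#   run.log(payload)
```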
