x35f / unstable_baselines
Re-implementations of SOTA RL algorithms.
Shouldn't line 108 also set model_rollout_steps = new_model_rollout_steps?
If it does need to be added, does MBPO performance improve or degrade after adding it?
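To make the question concrete, here is a minimal sketch of the pattern being asked about. It is not the repository's code: only the names model_rollout_steps and new_model_rollout_steps come from the issue, while rollout_length, count_resizes, and the schedule values (epochs 20 to 100, lengths 1 to 5) are hypothetical. It assumes an MBPO-style linear rollout-length schedule and counts how often a "rollout length changed" branch fires with and without the assignment. It says nothing about the effect on performance.

```python
def rollout_length(epoch, min_epoch, max_epoch, min_steps, max_steps):
    """Linearly interpolate the rollout length between two epochs (hypothetical helper)."""
    if epoch <= min_epoch:
        return min_steps
    if epoch >= max_epoch:
        return max_steps
    frac = (epoch - min_epoch) / (max_epoch - min_epoch)
    return int(min_steps + frac * (max_steps - min_steps))


def count_resizes(num_epochs, update_tracked_value):
    """Count how often the 'rollout length changed' branch is taken.

    update_tracked_value=True mimics adding
    `model_rollout_steps = new_model_rollout_steps` inside the branch.
    """
    model_rollout_steps = 1  # assumed initial rollout length
    resizes = 0
    for epoch in range(num_epochs):
        new_model_rollout_steps = rollout_length(epoch, 20, 100, 1, 5)
        if new_model_rollout_steps != model_rollout_steps:
            resizes += 1  # stand-in for resizing the model buffer
            if update_tracked_value:
                model_rollout_steps = new_model_rollout_steps
    return resizes


if __name__ == "__main__":
    print("with assignment:   ", count_resizes(120, True))
    print("without assignment:", count_resizes(120, False))
```

In this toy setting, without the assignment the comparison is always made against the initial value, so the branch keeps firing on every epoch once the schedule has moved past it.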
I use Anaconda on Windows 11 to run the code. I followed the instructions, but an error occurs when I run the command.
Command:
cd unstable_baselines/baselines/sac
python3 main.py configs/Ant-v3.py --gpu 0
Besides, when I try to run DQN on the SpaceInvaders environment, I also get an error.
It seems that line 41 of util does not fit. What should I do to fix the error? Or should I run the library on Linux?
I'd like to do the following but instead of SB3 I'd like to plug in unstable baselines. Is there a quick start guide or documentation somewhere that could help me get started?
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

# create the environment and wrap it in a vectorized environment
env = gym.make('MyEnv')
env = DummyVecEnv([lambda: env])

# create the PPO agent and train it on the environment
model = PPO('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)

# test the trained agent
obs = env.reset()
for i in range(100):
    action, _states = model.predict(obs)
    obs, rewards, dones, info = env.step(action)
    env.render()
    # a single wrapped env yields a length-1 array of done flags
    if dones[0]:
        break
env.close()
I see that the current performance curves are based on v3 tasks. However, the config files for model-based RL (MBPO) contain many v2 tasks. Are the v2 tasks in MBPO the same as the v3 tasks?
should be
env = gym.make(env_name, **kwargs)
for consistency.
git clone --recurse-submodules https://github.com/x35f/unstable_baselines.git
cd unstable_baselines
conda env create -f env.yaml
conda activate rl_base
pip install -e .
cd unstable_baselines
PS C:\dev\unstable_baselines> conda env create -f env.yaml
Retrieving notices: ...working... done
Collecting package metadata (repodata.json): done
Solving environment: failed
ResolvePackageNotFound:
- mujoco=2.1.2
PS C:\dev\unstable_baselines> conda activate rl_base
EnvironmentNameNotFound: Could not find conda environment: rl_base
You can list all discoverable environments with `conda info --envs`.
Hi all, this is very cool stuff. I especially like that there is an MBPO implementation.
Would you be interested in using wandb to contribute experiment runs to openrlbenchmark utilities? It provides more plots and analysis options and could make comparing with other libraries much easier. See openrlbenchmark/openrlbenchmark#22 for an example.
To use openrlbenchmark, all you need to do is to track metrics in wandb with the x-axis being global_step. If you use tensorboard, you just need to turn on the wandb tensorboard integration (example). Then you can use openrlbenchmark to pull runs from wandb and generate plots like in openrlbenchmark/openrlbenchmark#22
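As a rough illustration of the tensorboard integration mentioned above, the sketch below builds the arguments for wandb.init with sync_tensorboard=True, which asks wandb to mirror scalars written by a tensorboard writer. The helper names and the project and run names are hypothetical, and this is not how unstable_baselines currently logs. The wandb import is deferred so that the argument builder can be inspected without wandb installed.

```python
def build_wandb_init_kwargs(project, run_name, config):
    """Assemble keyword arguments for wandb.init (hypothetical helper)."""
    return {
        "project": project,          # hypothetical project name
        "name": run_name,            # hypothetical run name
        "config": dict(config),      # hyperparameters to record with the run
        "sync_tensorboard": True,    # mirror tensorboard scalars to wandb
    }


def start_tracking(project, run_name, config):
    """Start a wandb run; requires wandb to be installed and logged in."""
    import wandb  # deferred import: only needed when a run is actually started

    return wandb.init(**build_wandb_init_kwargs(project, run_name, config))


if __name__ == "__main__":
    kwargs = build_wandb_init_kwargs(
        "unstable-baselines-demo", "sac-Ant-v3-seed0", {"seed": 0}
    )
    print(kwargs)
```

Calling start_tracking before the tensorboard writer is created, and logging scalars against a global_step axis, would be one way to meet the requirement described in the post.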