Documentation | Implemented Algorithms | Installation | Getting Started | License

OmniSafe

This library is currently under heavy development. If you have suggestions on the API or use cases you'd like to see covered, please open a GitHub issue or reach out. We'd love to hear about how you're using the library.

OmniSafe is a comprehensive and reliable benchmark for safe reinforcement learning, covering a multitude of SafeRL domains and delivering a new suite of testing environments.

The simulation environments around OmniSafe and a series of reliable algorithm implementations make it easier for the SafeRL research community to replicate and improve on the excellent work already done, and to validate new ideas and new algorithms.

Implemented Algorithms
- Newly Published in 2022
- List of Algorithms
Installation
Getting Started
The OmniSafe Team
License

Implemented Algorithms

The currently supported algorithms include:

Newly Published in 2022

[AAAI 2023] Augmented Proximal Policy Optimization for Safe Reinforcement Learning (APPO) (the original author of the paper contributed code)
[NeurIPS 2022] Constrained Update Projection Approach to Safe Policy Optimization (CUP) (the original author of the paper contributed code)
[NeurIPS 2022] Effects of Safety State Augmentation on Safe Exploration (Simmer)
[NeurIPS 2022] Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm
[ICML 2022] Sauté RL: Almost Surely Safe Reinforcement Learning Using State Augmentation (SauteRL)
[ICML 2022] Constrained Variational Policy Optimization for Safe Reinforcement Learning (CVPO)
[IJCAI 2022] Penalized Proximal Policy Optimization for Safe Reinforcement Learning (the original author of the paper contributed code)
[ICLR 2022] Constrained Policy Optimization via Bayesian World Models (LA-MBDA)
[AAAI 2022] Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning (CAP)

List of Algorithms

Installation

Prerequisites

OmniSafe requires Python 3.8+ and PyTorch 1.10+.
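Before installing, it can help to confirm that the active environment satisfies these two minimums. The following is a minimal sketch, not part of OmniSafe: the helper names `parse_version`, `meets_minimum`, and `check_prerequisites` are hypothetical, and the sketch assumes PyTorch is installed under the distribution name `torch`.

```python
import sys


def parse_version(text):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    parts = []
    # Drop any local build tag after '+', then read the leading digits of each piece.
    for piece in text.split('+')[0].split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version_text, minimum):
    """Return True if version_text is at least the `minimum` tuple."""
    return parse_version(version_text) >= minimum


def check_prerequisites():
    """Return a list of problems; an empty list means both checks passed."""
    if sys.version_info[:2] < (3, 8):
        return ['Python 3.8+ is required']
    from importlib import metadata  # part of the standard library since Python 3.8

    try:
        torch_version = metadata.version('torch')
    except metadata.PackageNotFoundError:
        return ['PyTorch is not installed']
    if not meets_minimum(torch_version, (1, 10)):
        return [f'PyTorch 1.10+ is required, found {torch_version}']
    return []


if __name__ == '__main__':
    for problem in check_prerequisites():
        print(problem)
```

The check reads package metadata instead of importing `torch`, so it stays fast and also works when PyTorch is missing.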

Install from source

# Clone the repo
git clone https://github.com/PKU-MARL/omnisafe
cd omnisafe

# Create a conda environment
conda create -n omnisafe python=3.8
conda activate omnisafe

# Install omnisafe
pip install -e .

Examples

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1

algo:

Type	Name
`Base-On-Policy`	`PolicyGradient, PPO, NaturalPG, TRPO`
`Base-Off-Policy`	`DDPG, TD3, SAC`
`Naive Lagrange`	`RCPO, PPOLag, TRPOLag, DDPGLag, TD3Lag, SACLag`
`PID Lagrange`	`CPPOPid, TRPOPid`
`First Order`	`FOCOPS, CUP`
`Second Order`	`SDDPG, CPO, PCPO`
`Saute RL`	`PPOSaute, PPOLagSaute`
`Simmer RL`	`PPOSimmerQ, PPOSimmerPid, PPOLagSimmerQ, PPOLagSimmerPid`
`EarlyTerminated`	`PPOEarlyTerminated, PPOLagEarlyTerminated`
`Model-Based`	`CAP, MBPPOLag, SafeLOOP`
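When scripting experiments, it can be convenient to have the table above as data, for example to reject a mistyped `--algo` value before a long run starts. The sketch below simply transcribes the table into a dictionary. The names `ALGORITHMS`, `algo_type`, and `validate_algo` are hypothetical helpers for illustration and are not part of the OmniSafe API.

```python
# Transcription of the algorithm table: type -> algorithm names.
ALGORITHMS = {
    'Base-On-Policy': ['PolicyGradient', 'PPO', 'NaturalPG', 'TRPO'],
    'Base-Off-Policy': ['DDPG', 'TD3', 'SAC'],
    'Naive Lagrange': ['RCPO', 'PPOLag', 'TRPOLag', 'DDPGLag', 'TD3Lag', 'SACLag'],
    'PID Lagrange': ['CPPOPid', 'TRPOPid'],
    'First Order': ['FOCOPS', 'CUP'],
    'Second Order': ['SDDPG', 'CPO', 'PCPO'],
    'Saute RL': ['PPOSaute', 'PPOLagSaute'],
    'Simmer RL': ['PPOSimmerQ', 'PPOSimmerPid', 'PPOLagSimmerQ', 'PPOLagSimmerPid'],
    'EarlyTerminated': ['PPOEarlyTerminated', 'PPOLagEarlyTerminated'],
    'Model-Based': ['CAP', 'MBPPOLag', 'SafeLOOP'],
}


def algo_type(name):
    """Return the type a given algorithm is listed under, or None if it is not listed."""
    for type_name, names in ALGORITHMS.items():
        if name in names:
            return type_name
    return None


def validate_algo(name):
    """Raise ValueError with the list of known names if `name` is not in the table."""
    if algo_type(name) is None:
        known = sorted(n for names in ALGORITHMS.values() for n in names)
        raise ValueError(f'Unknown algorithm {name!r}; expected one of: {", ".join(known)}')
    return name
```

For instance, `validate_algo('PPOLag')` returns the name unchanged, while a name that is not in the table raises a `ValueError` that lists the accepted values.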

env-id: Environment ID in Safety Gymnasium. The environments that Safety Gymnasium supports are listed below.

Category	Task	Agent	Example
Safe Navigation	Goal[012]	Point, Car, Racecar, Ant	SafetyPointGoal1-v0
	Button[012]
	Push[012]
	Circle[012]
Safe Velocity	Velocity	HalfCheetah, Hopper, Swimmer, Walker2d, Ant, Humanoid	SafetyHumanoidVelocity-v4

For more information about the environments, please refer to Safety Gymnasium.
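The two example IDs in the table suggest a naming pattern: `Safety` + agent + task + level + `-v0` for navigation, and `Safety` + agent + `Velocity-v4` for velocity tasks. The sketch below composes IDs under that assumption. The pattern is inferred from only two examples, the function names are hypothetical, and the sketch assumes the four navigation agents apply to every navigation task, so not every generated ID is guaranteed to be registered in Safety Gymnasium.

```python
# Values copied from the environment table.
NAVIGATION_AGENTS = ('Point', 'Car', 'Racecar', 'Ant')
NAVIGATION_TASKS = ('Goal', 'Button', 'Push', 'Circle')
NAVIGATION_LEVELS = (0, 1, 2)
VELOCITY_AGENTS = ('HalfCheetah', 'Hopper', 'Swimmer', 'Walker2d', 'Ant', 'Humanoid')


def navigation_env_id(agent, task, level):
    """Compose a Safe Navigation ID, e.g. SafetyPointGoal1-v0 (assumed pattern)."""
    if agent not in NAVIGATION_AGENTS:
        raise ValueError(f'Unknown navigation agent: {agent!r}')
    if task not in NAVIGATION_TASKS:
        raise ValueError(f'Unknown navigation task: {task!r}')
    if level not in NAVIGATION_LEVELS:
        raise ValueError(f'Level must be 0, 1, or 2, got {level!r}')
    return f'Safety{agent}{task}{level}-v0'


def velocity_env_id(agent):
    """Compose a Safe Velocity ID, e.g. SafetyHumanoidVelocity-v4 (assumed pattern)."""
    if agent not in VELOCITY_AGENTS:
        raise ValueError(f'Unknown velocity agent: {agent!r}')
    return f'Safety{agent}Velocity-v4'


if __name__ == '__main__':
    print(navigation_env_id('Point', 'Goal', 1))
    print(velocity_env_id('Humanoid'))
```

Validating the inputs up front turns a typo into an immediate error instead of a failure inside the environment registry.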

parallel: Number of parallel processes.

Getting Started

1. Run Agent from preset yaml file

import omnisafe


env_id = 'SafetyPointGoal1-v0'
custom_cfgs = {
    'train_cfgs': {
        'total_steps': 1024000,
        'vector_env_nums': 1,
        'parallel': 1,
    },
    'algo_cfgs': {
        'update_cycle': 2048,
        'update_iters': 1,
    },
    'logger_cfgs': {
        'use_wandb': False,
    },
}

agent = omnisafe.Agent('PPOLag', env_id, custom_cfgs=custom_cfgs)
agent.learn()
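The nested `custom_cfgs` dictionary above is plain Python, so variations of it (for example, a short smoke-test run) can be derived without editing the base settings. The following is a minimal sketch of a recursive merge. `merge_cfgs` is a hypothetical helper written for illustration and says nothing about how OmniSafe itself combines `custom_cfgs` with its yaml defaults.

```python
import copy


def merge_cfgs(base, overrides):
    """Return a new dict with `overrides` recursively merged into `base`."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge one level deeper.
            merged[key] = merge_cfgs(merged[key], value)
        else:
            # Otherwise the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


base_cfgs = {
    'train_cfgs': {'total_steps': 1024000, 'vector_env_nums': 1},
    'algo_cfgs': {'update_cycle': 2048, 'update_iters': 1},
    'logger_cfgs': {'use_wandb': False},
}

# A shorter run that keeps every other setting from base_cfgs.
smoke_test_cfgs = merge_cfgs(base_cfgs, {'train_cfgs': {'total_steps': 2048}})
```

The result can be passed as `custom_cfgs=smoke_test_cfgs` in the same way as the dictionary above, and `base_cfgs` is left unmodified.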

3. Run Agent from custom terminal config

You can also run an agent with configuration passed on the command line. Any option defined in the corresponding yaml file can be set this way.

For example, you can run the PPOLag agent on the SafetyPointGoal1-v0 environment with total_steps=1024000, vector_env_nums=1 and parallel=1 by:

cd examples
python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 --total-steps 1024000 --device cpu --vector-env-nums 1 --torch-threads 1
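To launch several such runs from a script, the command line can be assembled programmatically. The sketch below is a hedged illustration: `build_train_command` is a hypothetical helper, and it assumes that every option follows the `--flag value` form used in the command above, with underscores in Python keyword names mapped to hyphens.

```python
import shlex


def build_train_command(algo, env_id, **options):
    """Build the argument list for train_policy.py from keyword options."""
    args = ['python', 'train_policy.py', '--algo', algo, '--env-id', env_id]
    for name, value in options.items():
        # total_steps=1024000 becomes: --total-steps 1024000
        args += ['--' + name.replace('_', '-'), str(value)]
    return args


command = build_train_command(
    'PPOLag',
    'SafetyPointGoal1-v0',
    parallel=1,
    total_steps=1024000,
    device='cpu',
    vector_env_nums=1,
    torch_threads=1,
)
print(shlex.join(command))
```

The printed line matches the command shown above. To actually start training, the list could be passed to `subprocess.run(command, cwd='examples', check=True)`, assuming the script is invoked from the repository root.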

4. Evaluate Saved Policy

import os

import omnisafe


# Fill in your experiment's log directory here.
# For example: ~/omnisafe/runs/SafetyPointGoal1-v0/CPO/seed-000-2022-12-25_14-45-05
LOG_DIR = ''

evaluator = omnisafe.Evaluator()
for item in os.scandir(os.path.join(LOG_DIR, 'torch_save')):
    if item.is_file() and item.name.split('.')[-1] == 'pt':
        evaluator.load_saved_model(save_dir=LOG_DIR, model_name=item.name)
        evaluator.render(num_episode=10, camera_name='track', width=256, height=256)
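The loop above renders every `.pt` file it finds, in whatever order the filesystem returns them. To inspect or filter the saved models first, the file discovery can be separated out. This is a minimal sketch using only the standard library. `find_checkpoints` is a hypothetical helper, and the only layout it assumes is the `torch_save` subdirectory used in the example above.

```python
from pathlib import Path


def find_checkpoints(log_dir):
    """Return the sorted names of the .pt files in <log_dir>/torch_save."""
    save_dir = Path(log_dir).expanduser() / 'torch_save'
    if not save_dir.is_dir():
        # Nothing saved yet, or LOG_DIR points at the wrong place.
        return []
    return sorted(
        item.name
        for item in save_dir.iterdir()
        if item.is_file() and item.suffix == '.pt'
    )
```

Each returned name can then be passed as `model_name` to `evaluator.load_saved_model`, as in the loop above. The sort is lexicographic, so a name containing `10` sorts before one containing `2` unless the numbers are zero-padded.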

The OmniSafe Team

OmniSafe is currently maintained by Borong Zhang, Jiayi Zhou, JTao Dai, Weidong Huang, Ruiyang Sun, Xuehai Pan, and Jiamg Ji, under the instruction of Prof. Yaodong Yang. If you have any questions while using omnisafe, don't hesitate to ask on the GitHub issue page; we will reply within 2-3 working days.

License

OmniSafe is released under Apache License 2.0.
