
The Verifiably Safe Reinforcement Learning Framework

License: MIT License


IBM Verifiably Safe Reinforcement Learning framework

This repository contains implementations of Verifiably Safe Reinforcement Learning (VSRL) algorithms. To learn more about the motivation for this framework, check out our web experience.

What is Verifiable Safety?

Safe reinforcement learning algorithms learn how to solve sequential decision making problems while maintaining safety specifications (e.g., collision avoidance) throughout the entire training process.

Verifiable safety means that these safety constraints are backed by formal, computer-checked proofs. Formally verified safety constraints are important because getting safety constraints correct is hard.

For example, if a car at position (car.x, car.y) wants to avoid an obstacle at position (obs.x, obs.y), the safety constraint car.x != obs.x AND car.y != obs.y is not sufficient! The car must instead correctly compute its braking distance based on a dynamical model of the car and the obstacle, so that it starts braking with sufficient lead time to ensure the car comes to a complete stop before reaching an obstacle.

Getting these braking distance calculations right is non-trivial, and safety constraints for more complex control problems are harder still. For that reason, hybrid systems theorem provers can provide computer-checked proofs that safety constraints are correct.
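To make the braking-distance argument concrete, here is a minimal sketch of such a constraint under a simple point-mass model. The function name, parameters (maximum braking rate `B`, control period `T`), and numbers are illustrative assumptions, not the constraint VSRL actually proves:

```python
def safe_to_accelerate(car_x: float, obs_x: float, v: float, B: float, T: float) -> bool:
    """Return True if, after driving at speed v for one more control period T,
    the car can still brake to a full stop (deceleration B) before obs_x.

    Point-mass model: distance covered during the period is v*T, and the
    braking distance from speed v is v**2 / (2*B).
    """
    assert B > 0 and T >= 0 and v >= 0
    stopping_point = car_x + v * T + v ** 2 / (2 * B)
    return stopping_point < obs_x

# The naive constraint car_x != obs_x would call this state "safe", but at
# v = 20 m/s with B = 5 m/s^2 the car needs 40 m just to brake:
print(safe_to_accelerate(car_x=0.0, obs_x=30.0, v=20.0, B=5.0, T=0.1))  # → False
```

Note how safety depends on the dynamics (speed and braking power), not just on current positions: the same car at the same position is safe with a 100 m gap but unsafe with a 30 m gap.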

Repository Structure

  • assets: PNG images used to render the environments
    • configs: config files for each environment that can be used to train an object detector
  • scripts: training scripts for the object detector and reinforcement learning
  • tests: tests for various components of VSRL
  • vsrl: the VSRL Python package
    • parser: parsers for strings containing ODEs or similar mathematical expressions
    • rl: reinforcement learning agents and environments. We use the RL implementations from rlpyt.
    • spaces: symbolic state spaces
    • symmap: for mapping from non-symbolic inputs (e.g. images) into a symbolic state space (e.g. object locations) over which safety constraints can be expressed
    • training: functions for training the detector / RL agents
    • utils: miscellaneous code
    • verifier: interfaces to formal methods tools that can be used to check the correctness of constraints, monitor modeling assumptions, etc.

Getting Started

Installation

If you want to run your models on GPUs, first follow the PyTorch installation guide to get the correct version of PyTorch for your CUDA version (otherwise, our dependencies include a CPU-only version). Then, install VSRL using pip:

git clone https://github.com/IBM/vsrl-framework.git
cd vsrl-framework
pip install .
# alternatively, pip install git+https://github.com/IBM/vsrl-framework.git

Environments

We provide three environments to test VSRL:

  • goal finding: the agent must avoid hazards and navigate to a goal
  • robot vacuum: the agent must clean up messes without breaking vases and then return to a charging station
  • adaptive cruise control: the agent must follow a leader vehicle at a close but safe distance

See the envs README for further details common to all environments.

Here are sample observations from each environment (rescaled) to get a feel for the setup.

PMGF sample observation PM sample observation ACC sample observation

Object Detection

In order to enforce safety constraints, we need to extract the positions of all safety-critical objects from the environment. To minimize data labelling, we just require at least one image of each object and at least one background image. The locations of these images are specified in a TOML file like so:

backgrounds = ["background.png"]

[objects]
agent = ["agent.png"]
hazard = ["hazard.png"]
goal = ["goal.png"]

(the paths should be absolute or relative to the TOML file)

We generate a dataset of potential observations from these images on-the-fly as we train the object detector. To start training, call

python scripts/train_detector.py --config_path <path_to_your_config_file> --save_dir <directory_to_save_models>

E.g. for the ACC environment, you might call

python scripts/train_detector.py --config_path assets/configs/ACC.toml --save_dir ~/models

Then the saved model will be in ~/models/vsrl/detector_ACC_0/checkpoints/.

From within a script or notebook, you could instead do

from vsrl.training import train_center_track
config_path = ... # e.g. the path to "assets/configs/ACC.toml"
save_dir = ... # e.g. "/home/user/models"
model = train_center_track(config_path, save_dir)

Load a saved model like this:

from vsrl.symmap.detectors.center_track import CenterTrack

# replace this with your checkpoint
checkpoint_path = "/home/user/models/vsrl/detector_ACC_0/checkpoints/epoch=1.ckpt"
model = CenterTrack.load_from_checkpoint(checkpoint_path)

A config file for each of our environments is in assets/configs.

For more visually complex environments (e.g. in the real world), a good pre-trained object detector would be essential.

Training VSRL Agents

The scripts/train_rl.py script provides a convenient way to train agents using VSRL:

python scripts/train_rl.py --env_name ACC --pbar true

Verifying Constraints

Constraints can be verified using KeYmaera X and then copied into Python. You can also run KeYmaera X from VSRL using the Python wrapper in the verifier directory. Although there is an interface that allows direct use of controller monitors from KeYmaera X as strings, translating constraints into Python by hand is often necessary.
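Translating by hand typically means re-expressing the verified constraint as a plain Python predicate evaluated before each action reaches the environment. The sketch below is illustrative only (a simplified adaptive-cruise-control monitor with hypothetical names; it is not actual KeYmaera X output, and it assumes the leader holds constant speed):

```python
def acc_controller_monitor(
    rel_pos: float,    # gap from follower to leader (m)
    v_follower: float, # follower speed (m/s)
    v_leader: float,   # leader speed (m/s)
    B: float,          # maximum braking rate (m/s^2)
    A: float,          # proposed acceleration for this period (m/s^2)
    T: float,          # control period (s)
) -> bool:
    """Return True when accelerating at A for one period T is still safe.

    Simplified model: the leader holds constant speed, and after the period
    the follower can brake at rate B; "safe" means the follower can match
    the leader's speed before the gap closes.
    """
    v_next = v_follower + A * T
    # gap remaining after one control period
    gap_after = rel_pos - (v_follower * T + 0.5 * A * T ** 2) + v_leader * T
    closing = max(v_next - v_leader, 0.0)
    return closing ** 2 / (2 * B) < gap_after

def guarded_accel(proposed: float, fallback: float, **state) -> float:
    """Sandbox pattern: take the learned action only if the monitor approves."""
    return proposed if acc_controller_monitor(A=proposed, **state) else fallback
```

The sandboxing shape is the important part: the learned policy proposes an action, the hand-translated monitor checks it, and a verified fallback (e.g. maximal braking) is substituted whenever the check fails.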

Citing VSRL

The following BibTeX provides a canonical citation for VSRL:

@inproceedings{VSRL,
  author    = {Nathan Hunt and
               Nathan Fulton and
               Sara Magliacane and
               Trong Nghia Hoang and
               Subhro Das and
               Armando Solar{-}Lezama},
  editor    = {Sergiy Bogomolov and
               Rapha{\"{e}}l M. Jungers},
  title     = {Verifiably safe exploration for end-to-end reinforcement learning},
  booktitle = {{HSCC} '21: 24th {ACM} International Conference on Hybrid Systems:
               Computation and Control, Nashville, Tennessee, May 19-21, 2021},
  pages     = {14:1--14:11},
  publisher = {{ACM}},
  year      = {2021},
  url       = {https://doi.org/10.1145/3447928.3456653},
  doi       = {10.1145/3447928.3456653},
  timestamp = {Wed, 19 May 2021 15:10:46 +0200},
  biburl    = {https://dblp.org/rec/conf/hybrid/HuntFMHDS21.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

The main components of VSRL are additionally described in several papers.

Key Contributors

License

This project is licensed under the terms of the MIT License. See license.txt for details.

Contact

vsrl-framework's People

Contributors: diego-plan9, imgbotapp, neighthan, nrfulton


vsrl-framework's Issues

Setup.py fails to install and missing dependencies

System Information

  • Python 3.7.5
  • Linux (debian)
  • 5.4.0-64-generic

Describe the current behavior
Running python setup.py install in a virtual environment fails. This is due to errors about installing rlpyt and version conflicts with pillow and chardet. After fixing those issues, the tests fail because of a missing pytest dependency.

Describe the expected behavior
Creating a virtual environment and running python setup.py install should work out of the box with no errors. Tests should also run without errors about missing dependencies.

Standalone code to reproduce the issue

virtualenv --clear --python=python3.7 venv
source venv/bin/activate
python setup.py install

Below is my fix for setup.py

import os

from setuptools import find_packages, setup

if "DISPLAY" not in os.environ:
    opencv_pkg = "opencv-python-headless"
else:
    opencv_pkg = "opencv-python"

setup(
    name="vsrl",
    version="0.0.1",
    description="Visceral: A Framework for Verifiably Safe Reinforcement Learning",
    author="IBM Research",
    author_email="[email protected]",
    url="https://visceral.safelearning.ai",
    packages=find_packages(),
    install_requires=[
        "scipy",
        "numpy",
        "torch",
        "pillow==7.2.0",
        "chardet==3.0.4",
        opencv_pkg,
        "pytorch_lightning",
        "comet_ml",
        "psutil",
        "torchvision",
        "parsimonious",
        "matplotlib",
        "portion",
        "toml",
        "auto-argparse",
        "gym",
        "pytest",
    ],
    extras_require={"dev": ["pytest", "pytest-cov"]},
    dependency_links=["http://github.com/astooke/rlpyt/tarball/master"]
)

Script for VSRL Agents not working

I am trying to use the framework, but unfortunately the provided scripts do not work. I'm not sure if this is an issue on my end, or if it may be because of unspecified dependency versions. I would be really happy for some help.

Expected Behavior

Running the scripts/train_rl.py should work. The error I get is the following.

(error screenshot omitted)

Possible Solution

Add a requirements.txt file

Steps to Reproduce

As stated in the README.

python scripts/train_rl.py --env_name ACC --pbar true

P.S. it is required to specify the pbar parameter as true/false. The README does not state that.

Context (Environment)

Branch - master
Commit hash - 93d85dd
Environment - macOS

Package Versions
  • absl-py==0.11.0
  • attrs==20.2.0
  • auto-argparse==0.0.7
  • cachetools==4.1.1
  • certifi==2020.6.20
  • chardet==3.0.4
  • cloudpickle==1.6.0
  • comet-ml==3.2.5
  • configobj==5.0.6
  • cycler==0.10.0
  • dataclasses==0.6
  • dulwich==0.20.6
  • everett==1.0.3
  • fsspec==0.8.4
  • future==0.18.2
  • google-auth==1.22.1
  • google-auth-oauthlib==0.4.1
  • grpcio==1.33.1
  • gym==0.17.3
  • idna==2.10
  • jsonschema==3.2.0
  • kiwisolver==1.3.0
  • Markdown==3.3.3
  • matplotlib==3.3.2
  • netifaces==0.10.9
  • numpy==1.19.2
  • nvidia-ml-py3==7.352.0
  • oauthlib==3.1.0
  • opencv-python==4.4.0.44
  • parsimonious==0.8.1
  • Pillow==8.0.1
  • portion==2.1.3
  • protobuf==3.13.0
  • psutil==5.7.3
  • pyasn1==0.4.8
  • pyasn1-modules==0.2.8
  • pyglet==1.5.0
  • pyparsing==2.4.7
  • pyrsistent==0.17.3
  • python-dateutil==2.8.1
  • pytorch-lightning==1.0.4
  • PyYAML==5.3.1
  • requests==2.24.0
  • requests-oauthlib==1.3.0
  • rlpyt @ git+https://github.com/astooke/rlpyt.git@f04f23db1eb7b5915d88401fca67869968a07a37
  • rsa==4.6
  • scipy==1.5.3
  • six==1.15.0
  • sortedcontainers==2.2.2
  • tensorboard==2.3.0
  • tensorboard-plugin-wit==1.7.0
  • toml==0.10.1
  • torch==1.7.0
  • torchvision==0.8.1
  • tqdm==4.51.0
  • typing-extensions==3.7.4.3
  • urllib3==1.25.11
  • websocket-client==0.57.0
  • Werkzeug==1.0.1
  • wrapt==1.12.1
  • wurlitzer==2.0.1

Add tests for examples

We should have tests covering the following:

  • All RL libraries that we claim to support
  • All of the environments that ship with VSRL
  • All of the examples described in the README, and in any additional future documentation.
