This repository contains the code for the NeurIPS 2021 submission "Local policy search with Bayesian optimization".

Local policy search with Bayesian optimization

The algorithms implemented in this repo can solve black-box optimization problems. Black-box optimization refers to the general setup of optimizing an unknown function where only its evaluations are available.
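As a rough illustration of this setup, the sketch below maximizes a function using nothing but its evaluations. It is plain uniform random search written for this example and may differ from the random search variant implemented in the repo; the names `random_search` and `black_box`, the bounds, and the objective are all hypothetical.

```python
import numpy as np


def random_search(objective, dim, num_evaluations, bounds=(-1.0, 1.0), seed=0):
    """Maximize `objective` using only its evaluations (no gradients)."""
    rng = np.random.default_rng(seed)
    low, high = bounds
    best_x, best_value = None, -np.inf
    for _ in range(num_evaluations):
        x = rng.uniform(low, high, size=dim)  # sample a candidate uniformly
        value = objective(x)                  # the only access to the function
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


def black_box(x):
    # Hypothetical objective; the optimizer never sees this formula.
    return -np.sum((x - 0.3) ** 2)


best_x, best_value = random_search(black_box, dim=2, num_evaluations=500)
```

The optimizer only ever calls `objective(x)`, which is what makes the problem black-box: the same loop would work for a simulator rollout that returns a reward.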

We introduce a new method that makes local gradient methods usable for black-box optimization: it actively samples the objective to estimate gradients efficiently within the Bayesian optimization framework.
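To make gradient estimation from function evaluations concrete, here is a minimal NumPy sketch. It is not the repo's implementation: it fits a zero-mean Gaussian process with a squared-exponential kernel, differentiates the posterior mean analytically, and takes one gradient ascent step. The hyperparameters, the objective, the step size, and all function names are assumptions, and the active selection of sample locations is left out entirely (points are drawn at random).

```python
import numpy as np


def se_kernel(A, B, lengthscale=1.0, outputscale=1.0):
    """Squared-exponential kernel matrix between rows of A (n, d) and B (m, d)."""
    sq_dists = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return outputscale * np.exp(-0.5 * sq_dists / lengthscale**2)


def fit_weights(X, y, lengthscale=1.0, outputscale=1.0, noise=1e-4):
    """Solve (K + noise * I) alpha = y for a zero-mean GP (noise is a variance)."""
    K = se_kernel(X, X, lengthscale, outputscale) + noise * np.eye(len(X))
    return np.linalg.solve(K, y)


def posterior_mean(x, X, alpha, lengthscale=1.0, outputscale=1.0):
    """Posterior mean at a single point x of shape (d,)."""
    k_x = se_kernel(x[None, :], X, lengthscale, outputscale)[0]
    return k_x @ alpha


def posterior_mean_gradient(x, X, alpha, lengthscale=1.0, outputscale=1.0):
    """Analytic gradient of the posterior mean at x."""
    k_x = se_kernel(x[None, :], X, lengthscale, outputscale)[0]  # (n,)
    # d k(x, x_i) / d x = k(x, x_i) * (x_i - x) / lengthscale**2
    dk_dx = k_x[:, None] * (X - x[None, :]) / lengthscale**2     # (n, d)
    return dk_dx.T @ alpha                                       # (d,)


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 2))  # randomly placed evaluations
y = -np.sum(X**2, axis=1)                 # hypothetical black-box objective
alpha = fit_weights(X, y)

x0 = np.array([0.5, -0.5])
grad = posterior_mean_gradient(x0, X, alpha)
x1 = x0 + 0.1 * grad                      # one gradient ascent step
```

Because the derivative of a Gaussian process is again a Gaussian process, the model also provides an uncertainty over the gradient, which is what makes targeted sampling for gradient information possible; this sketch uses only the mean.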

Code of the repo

  • optimizers: Optimizers that can be applied to black-box functions: random search, vanilla Bayesian optimization, CMA-ES, and the proposed method, Gradient Information with BO (GIBO).
  • model: A Gaussian process model with a squared-exponential kernel that also supplies the Jacobian.
  • policy parameterization: Multilayer perceptrons as policy parameterization for solving reinforcement learning problems.
  • environment api: Interface for interactions with reinforcement learning environments of OpenAI Gym.
  • acquisition function: Custom acquisition function for gradient information.
  • loop: Brings together all parts necessary for an optimization loop.
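For the policy parameterization item above, the following sketch shows one way a multilayer perceptron can be driven by a single flat parameter vector, so that a black-box optimizer only has to search over that vector. The class name `MLPPolicy`, the layer sizes, and the tanh activation are assumptions made for illustration, not the repo's API.

```python
import numpy as np


class MLPPolicy:
    """Deterministic MLP policy whose weights come from one flat parameter vector."""

    def __init__(self, layer_sizes):
        # Each layer has a weight matrix (n_in x n_out) and a bias (n_out).
        self.shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))
        self.num_params = sum(n_in * n_out + n_out for n_in, n_out in self.shapes)

    def __call__(self, params, observation):
        assert params.shape == (self.num_params,)
        h = np.asarray(observation, dtype=float)
        offset = 0
        for i, (n_in, n_out) in enumerate(self.shapes):
            W = params[offset:offset + n_in * n_out].reshape(n_in, n_out)
            offset += n_in * n_out
            b = params[offset:offset + n_out]
            offset += n_out
            h = h @ W + b
            if i < len(self.shapes) - 1:
                h = np.tanh(h)  # hidden-layer activation (assumed)
        return h


# Hypothetical sizes: 8-dimensional observation, 16 hidden units, 2-dimensional action.
policy = MLPPolicy([8, 16, 2])
params = np.zeros(policy.num_params)
action = policy(params, np.ones(8))
```

With this layout, the objective handed to an optimizer would be a function from `params` to the return of an episode, which keeps the optimizer independent of the network architecture.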

Installation

Our GIBO implementation relies on mujoco-py 0.5.7 with MuJoCo Pro version 1.31. To install MuJoCo, follow the instructions here: https://github.com/openai/mujoco-py. To run the Linear Quadratic Regulator experiments, follow the instructions under gym-lqr.

Pip

In an environment with Python 3.8.5, you can install all required packages with

pip install -r requirements.txt

Conda

Or you can create an Anaconda environment called gibo using

conda env create -f environment.yaml
conda activate gibo

Pipenv

Or you can install and activate an environment via pipenv

pipenv install
pipenv shell

Usage

For experiments with synthetic test functions and reinforcement learning problems (e.g. MuJoCo), a command-line interface is supplied.

Synthetic Test Functions

Run

First generate the needed data for the synthetic test functions.

python generate_data_synthetic_functions.py -c ./configs/synthetic_experiment/generate_data_default.yaml

Afterwards you can run, for instance, our method Bayesian gradient ascent (bga) on these test functions.

python run_synthetic_experiment.py -c ./configs/synthetic_experiment/bga_default.yaml -cd ./configs/synthetic_experiment/generate_data_default.yaml

Evaluate

Evaluation of the synthetic experiments and reproduction of the paper's figures can be done with the notebook evaluation synthetic experiment.

Reproduce Paper Results

To reproduce the results of the paper use these config files.

Reinforcement Learning

Run

Run the MuJoCo swimmer environment with the proposed method Bayesian gradient ascent (bga).

python run_rl_experiment.py -c ./configs/rl_experiment/bga_default.yaml

Evaluate

Create a plot that compares rewards over function calls for different optimizers (in this case, bga and random search).

python evaluation_rl_experiment.py -path path_to_image/image.pdf -cs ./configs/rl_experiment/bga_default.yaml ./configs/rl_experiment/rs_default.yaml 

Or use the notebook evaluation rl experiment to reproduce the figures of the paper.

Reproduce Paper Results

To reproduce the results of the paper, use the linked config files for cartpole, swimmer, and hopper.

Linear Quadratic Regulator

To reproduce the results and plots of the paper, run the code in the notebook lqr_experiment.
