MPOPIS (Model Predictive Optimized Path Integral Strategies)

A version of model predictive path integral control (MPPI) that allows adaptive importance sampling (AIS) algorithms to be incorporated into the original importance sampling step. Model predictive optimized path integral control (MPOPI) is more sample efficient than MPPI, achieving better performance with fewer samples. A video of MPPI and MPOPI controlling 3 cars side by side for comparison can be seen here.
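As a rough illustration of the importance sampling step that MPOPI builds on, here is a minimal sketch of an MPPI-style update on a toy scalar problem. It is not the MPOPIS implementation: `toy_cost`, `mppi_update`, and the parameters `σ` (sampling standard deviation) and `λ` (temperature) are hypothetical names chosen for this example.

```julia
using Random

# Hypothetical toy problem: pick a scalar control u that minimizes (u - 2)^2.
toy_cost(u) = (u - 2.0)^2

# One MPPI-style update: perturb the nominal control, weight each sample by
# exp(-cost / λ), and return the weighted average of the samples.
function mppi_update(u_nominal, cost; num_samples=150, σ=1.0, λ=0.5,
                     rng=Random.default_rng())
    samples = u_nominal .+ σ .* randn(rng, num_samples)
    costs = cost.(samples)
    # Subtract the minimum cost before exponentiating for numerical stability.
    weights = exp.(-(costs .- minimum(costs)) ./ λ)
    weights ./= sum(weights)
    return sum(weights .* samples)
end

rng = MersenneTwister(1)
u_new = mppi_update(0.0, toy_cost; rng=rng)
println("updated control: ", u_new)
```

Because every sample is drawn around the nominal control, only the few samples that land near low-cost regions carry meaningful weight. Adapting the sampling distribution before this weighting step is what AIS adds.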

The addition of AIS enables the algorithm to use a better set of samples for the calculation of the control. A depiction of how the samples evolve over iterations can be seen in the following gif.

MPOPI (CE) 150 Samples, 10 Iterations
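The way a sample set can evolve over iterations may be sketched with a cross-entropy style loop on a toy scalar problem. This is an illustrative sketch under stated assumptions, not the code used by MPOPIS: `toy_cost`, `ce_adapt`, and `elite_fraction` are hypothetical names, and the defaults of 150 samples and 10 iterations simply mirror the caption above.

```julia
using Random, Statistics

# Hypothetical toy cost with its minimum at u = 2.
toy_cost(u) = (u - 2.0)^2

# Cross-entropy-style adaptive importance sampling: at each iteration, sample
# from the current Gaussian proposal, keep the lowest-cost (elite) samples,
# and refit the proposal mean and standard deviation to those elites.
function ce_adapt(cost, μ, σ; num_samples=150, num_iterations=10,
                  elite_fraction=0.2, rng=Random.default_rng())
    μ, σ = float(μ), float(σ)
    num_elite = max(2, round(Int, elite_fraction * num_samples))
    history = [(μ, σ)]
    for _ in 1:num_iterations
        samples = μ .+ σ .* randn(rng, num_samples)
        elite = samples[sortperm(cost.(samples))[1:num_elite]]
        μ, σ = mean(elite), std(elite)
        push!(history, (μ, σ))
    end
    return history
end

history = ce_adapt(toy_cost, 0.0, 1.0; rng=MersenneTwister(1))
for (k, (m, s)) in enumerate(history)
    println("iteration ", k - 1, ": mean = ", round(m, digits=3),
            ", std = ", round(s, digits=3))
end
```

In this toy setting the proposal mean moves toward the low-cost region while its spread shrinks, so later iterations spend their samples where they matter.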

Policy Options

Versions of MPPI and MPOPI implemented

Non-Iterative MPPI and GMPPI
- MPPI (:mppi): Model Predictive Path Integral Control¹²
- GMPPI (:gmppi): generalized version of MPPI, treating the control sequence as one control vector with a combined covariance matrix

MPOPI
- PMC (:pmcmppi): population Monte Carlo algorithm with one distribution³
- μ-AIS (:μaismppi): mean-only moment-matching AIS algorithm
- μΣ-AIS (:μΣaismppi): mean and covariance moment-matching AIS algorithm similar to Mixture-PMC⁴
- CE (:cemppi): cross-entropy method⁵⁶
- CMA (:cmamppi): covariance matrix adaptation evolution strategy⁵⁷
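To make the moment-matching entries above more concrete, the sketch below computes a weighted mean and covariance from a set of samples. It assumes that moment matching here means refitting a Gaussian proposal to the importance-weighted samples. The function name `weighted_moments` and the data are hypothetical, and the exact updates in MPOPIS may differ. A mean-only variant would use `μ` and leave the covariance unchanged.

```julia
using LinearAlgebra

# Weighted moment matching: `samples` is a d×N matrix with one sample per
# column, and `weights` holds N nonnegative importance weights.
function weighted_moments(samples, weights)
    w = weights ./ sum(weights)              # normalize the weights
    μ = samples * w                          # weighted mean (length d)
    centered = samples .- μ                  # subtract the mean from each column
    Σ = centered * Diagonal(w) * centered'   # weighted covariance (d×d)
    return μ, Σ
end

# Two 2-D samples (columns) with unnormalized weights 1 and 3.
samples = [0.0 2.0;
           0.0 0.0]
weights = [1.0, 3.0]
μ, Σ = weighted_moments(samples, weights)
println("mean = ", μ)
println("covariance = ", Σ)
```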

For implementation details, see the source code. For the simulation parameters used, see the wiki.

Getting Started

Use the Julia package manager to add the MPOPIS module:

] add https://github.com/sisl/MPOPIS

Use the built-in example to simulate the MountainCar environment:

using MPOPIS
simulate_mountaincar(policy_type=:cemppi, num_trials=5)

Simulate the Car Racing environment and save a gif:

simulate_car_racing(save_gif=true)

Plot the trajectories and simulate multiple cars:

simulate_car_racing(num_cars=3, plot_traj=true, save_gif=true)

Footnotes

1. Grady Williams, Nolan Wagener, Brian Goldfain, Paul Drews, James M. Rehg, Byron Boots, and Evangelos A. Theodorou. Information theoretic MPC for model-based reinforcement learning. Proceedings - IEEE International Conference on Robotics and Automation, 2017. doi: 10.1109/ICRA.2017.7989202.
2. Grady Robert Williams. Model predictive path integral control: Theoretical foundations and applications to autonomous driving. PhD thesis, Georgia Institute of Technology, 2019.
3. O. Cappé, A. Guillin, J. M. Marin, and C. P. Robert. Population Monte Carlo. Journal of Computational and Graphical Statistics, 13:907–929, 2004. doi: 10.1198/106186004X12803.
4. Olivier Cappé, Randal Douc, Arnaud Guillin, Jean Michel Marin, and Christian P. Robert. Adaptive importance sampling in general mixture classes. Statistics and Computing, 18, 2008. doi: 10.1007/s11222-008-9059-x.
5. Mykel J. Kochenderfer and Tim A. Wheeler. Algorithms for Optimization. MIT Press, 2019.
6. Reuven Y. Rubinstein and Dirk P. Kroese. The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-Carlo Simulation (Information Science and Statistics). Springer-Verlag, 2004.
7. Yousef El-Laham, Victor Elvira, and Monica F. Bugallo. Robust covariance adaptation in adaptive importance sampling. IEEE Signal Processing Letters, 25, 2018. doi: 10.1109/LSP.2018.2841641.
