A version of model predictive path integral control (MPPI) that allows adaptive importance sampling (AIS) algorithms to be incorporated into the original importance sampling step. Model predictive optimized path integral control (MPOPI) is more sample efficient than MPPI, achieving better performance with fewer samples. A video of MPPI and MPOPI controlling 3 cars side by side for comparison can be seen here.
The addition of AIS enables the algorithm to use a better set of samples for the calculation of the control. A depiction of how the samples evolve over iterations can be seen in the following gif.
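As a rough illustration of how AIS can move the sampling distribution toward lower-cost samples over iterations, here is a minimal sketch of a mean-only moment matching loop on a toy problem. It is not the package's implementation: `toy_cost`, `importance_weights`, `mean_only_ais`, and the parameters `λ` and `σ` are hypothetical names chosen for this example, and the weights use a simplified exponential form that omits the control-cost correction a full MPPI weight would include.

```julia
using Random, Statistics, LinearAlgebra

# Toy cost (an assumption for illustration): squared distance from a made-up optimum.
toy_cost(u) = sum(abs2, u .- [1.0, -2.0])

# Simplified MPPI-style weights: exp(-(cost - min cost) / λ), normalized to sum to 1.
# Subtracting the minimum cost keeps the exponentials numerically stable.
function importance_weights(costs; λ=1.0)
    w = exp.(-(costs .- minimum(costs)) ./ λ)
    return w ./ sum(w)
end

# Mean-only moment matching: sample around the current mean, weight the samples,
# and use the weighted sample mean as the mean of the next proposal.
function mean_only_ais(cost, μ0; σ=1.0, num_samples=200, iterations=5, λ=1.0,
                       rng=MersenneTwister(0))
    μ = copy(μ0)
    for _ in 1:iterations
        samples = [μ .+ σ .* randn(rng, length(μ)) for _ in 1:num_samples]
        w = importance_weights(cost.(samples); λ=λ)
        μ = sum(w .* samples)  # weighted mean becomes the new proposal mean
    end
    return μ
end
```

Calling `mean_only_ais(toy_cost, zeros(2))` should return a mean close to the toy optimum; the covariance is held fixed here, which is the difference from a mean-and-covariance scheme.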
Versions of MPPI and MPOPI implemented
- Non-Iterative MPPI and GMPPI
- MPOPI
  - PMC (:pmcmppi): population Monte Carlo algorithm with one distribution [3]
  - μ-AIS (:μaismppi): mean only moment matching AIS algorithm
  - μΣ-AIS (:μΣaismppi): mean and covariance moment matching AIS algorithm similar to Mixture-PMC [4]
  - CE (:cemppi): cross-entropy method [5][6]
  - CMA (:cmamppi): covariance matrix adaptation evolutionary strategy [5][7]
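For readers unfamiliar with the cross-entropy method listed above, the following is a minimal sketch of its textbook form: sample from a Gaussian, keep the lowest-cost "elite" samples, and refit the mean and covariance to them. It is an illustration under stated assumptions, not the code behind `:cemppi`; the function names, the elite count, and the small diagonal regularization term are all choices made for this example.

```julia
using Random, Statistics, LinearAlgebra

# One cross-entropy update: sample, select the elite samples, refit the Gaussian.
function cross_entropy_step(cost, μ, Σ; num_samples=200, num_elite=20,
                            rng=MersenneTwister(0))
    L = cholesky(Symmetric(Σ)).L
    samples = [μ .+ L * randn(rng, length(μ)) for _ in 1:num_samples]
    order = sortperm(cost.(samples))                    # lowest cost first
    elite = reduce(hcat, samples[order[1:num_elite]])   # d × num_elite matrix
    μ_new = vec(mean(elite, dims=2))
    Σ_new = cov(elite, dims=2) + 1e-6I                  # keep Σ positive definite
    return μ_new, Σ_new
end

# Repeat the update, sharing one random number generator across iterations.
function cross_entropy_minimize(cost, μ0, Σ0; iterations=10,
                                rng=MersenneTwister(0), kwargs...)
    μ, Σ = μ0, Σ0
    for _ in 1:iterations
        μ, Σ = cross_entropy_step(cost, μ, Σ; rng=rng, kwargs...)
    end
    return μ, Σ
end
```

Unlike a weighted moment matching update, this version gives every elite sample equal weight and discards the rest, so the elite count controls how aggressively the distribution contracts.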
For implementation details, see the source code. For the simulation parameters used, see the wiki.
Use the Julia package manager to add the MPOPIS module:
] add https://github.com/sisl/MOPOPIS
Use the built-in example to simulate the MountainCar environment:
using MPOPIS
simulate_mountaincar(policy_type=:cemppi, num_trials=5)
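To compare several variants on the same environment, the call above can be wrapped in a small helper. This is a sketch only: `run_all_policies` and `MPOPI_POLICIES` are hypothetical names, it assumes every policy symbol from the list of implemented versions is accepted by the `policy_type` keyword, and it makes no assumption about what the simulation function returns beyond storing it.

```julia
# Policy symbols copied from the list of implemented MPOPI variants.
MPOPI_POLICIES = (:pmcmppi, :μaismppi, :μΣaismppi, :cemppi, :cmamppi)

# `simulate` is any function that accepts the `policy_type` and `num_trials`
# keywords, for example `simulate_mountaincar` from MPOPIS.
function run_all_policies(simulate; policies=MPOPI_POLICIES, num_trials=5)
    results = Dict{Symbol,Any}()
    for policy in policies
        # Store whatever the simulation returns, keyed by policy symbol.
        results[policy] = simulate(policy_type=policy, num_trials=num_trials)
    end
    return results
end
```

After `using MPOPIS`, this would be called as `run_all_policies(simulate_mountaincar)`. Passing the simulation function as an argument keeps the helper independent of the package, so it can be tried with a stub first.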
Simulate the Car Racing environment and save a gif:
simulate_car_racing(save_gif=true)
Also plot the trajectories and simulate multiple cars:
simulate_car_racing(num_cars=3, plot_traj=true, save_gif=true)
Footnotes
1. Grady Williams, Nolan Wagener, Brian Goldfain, Paul Drews, James M. Rehg, Byron Boots, and Evangelos A. Theodorou. Information theoretic MPC for model-based reinforcement learning. Proceedings - IEEE International Conference on Robotics and Automation, 2017. doi: 10.1109/ICRA.2017.7989202.
2. Grady Robert Williams. Model predictive path integral control: Theoretical foundations and applications to autonomous driving. PhD thesis, Georgia Institute of Technology, 2019.
3. O. Cappé, A. Guillin, J. M. Marin, and C. P. Robert. Population Monte Carlo. Journal of Computational and Graphical Statistics, 13:907–929, 2004. doi: 10.1198/106186004X12803.
4. Olivier Cappé, Randal Douc, Arnaud Guillin, Jean Michel Marin, and Christian P. Robert. Adaptive importance sampling in general mixture classes. Statistics and Computing, 18, 2008. doi: 10.1007/s11222-008-9059-x.
5. Mykel J. Kochenderfer and Tim A. Wheeler. Algorithms for Optimization. MIT Press, 2019.
6. Reuven Y. Rubinstein and Dirk P. Kroese. The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-Carlo Simulation (Information Science and Statistics). Springer-Verlag, 2004.
7. Yousef El-Laham, Victor Elvira, and Monica F. Bugallo. Robust covariance adaptation in adaptive importance sampling. IEEE Signal Processing Letters, 25, 2018. doi: 10.1109/LSP.2018.2841641.