
ferreirafabio / mppi_pendulum

A reimplementation of Model Predictive Path Integral (MPPI) control from the paper "Information Theoretic MPC for Model-Based Reinforcement Learning" (Williams et al., 2017) for the Pendulum OpenAI Gym environment.

Python 100.00%

mppi_pendulum's Introduction

I am a PhD student at the Machine Learning Group in Freiburg under the supervision of Frank Hutter.

Current project: Beyond Random Augmentations: Pretraining with Hard Views (under review at NeurIPS 2024)

Past projects:

Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How (ICLR 2024, oral)
Zero-Shot AutoML with Pretrained Models (ICML 2022)
Learning Environments for Reinforcement Learning (ICLR 2022)


mppi_pendulum's Issues

Questions on the implementation

Hello,
Thanks for providing this implementation!

I have a quick question on the implementation though.
In this line, you are just adding up the costs along each trajectory, whereas Algorithm 2 in the paper adds an extra term, $\lambda \, u_{t-1}^{T} \Sigma^{-1} \epsilon_{t-1}^{k}$, to the cost at each step.
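To make the question concrete, here is a minimal sketch of how the per-trajectory cost could be accumulated with that extra term included. It is not the repository's code: the function names `rollout_costs` and `mppi_weights`, the `dynamics` and `state_cost` callables, and the array shapes are all assumptions made for illustration.

```python
import numpy as np

def rollout_costs(u, eps, x0, dynamics, state_cost, lam, sigma_inv):
    """Accumulate the cost S for K sampled perturbation sequences.

    u:   (T, m) nominal control sequence
    eps: (K, T, m) sampled control perturbations
    dynamics(x, v) -> next state, state_cost(x) -> float (both hypothetical)
    """
    K, T, _ = eps.shape
    S = np.zeros(K)
    for k in range(K):
        x = x0
        for t in range(T):
            # Step with the perturbed control u_{t-1} + eps_{t-1}^k.
            x = dynamics(x, u[t] + eps[k, t])
            # State cost plus the term lambda * u^T Sigma^{-1} eps.
            S[k] += state_cost(x) + lam * u[t] @ sigma_inv @ eps[k, t]
    return S

def mppi_weights(S, lam):
    """Softmin weights over trajectories; subtracting min(S) avoids overflow."""
    w = np.exp(-(S - S.min()) / lam)
    return w / w.sum()
```

Dropping the second summand in the inner loop gives the "just adding up costs" variant described above.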

Sample from approximate dynamics

Are there any plans to extend this to approximate dynamics (e.g. a neural network model) and to use importance sampling instead of sampling trajectories directly from the environment?
(That is, replace the env argument of __init__ with a dynamics argument, and pass env only to the control method for the actual stepping.)

That would actually match the contributions from the 2017 paper and make it more broadly applicable. I would like to use this in an environment where I can't reset the state of the simulator, so trajectories have to be generated with the model.
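A rough sketch of what such an interface might look like is below. Everything in it is hypothetical: the class name `ModelBasedMPPI`, the `dynamics` and `running_cost` callables, and the default parameters are assumptions, not part of this repository. For brevity the control-cost term from Algorithm 2 is not added separately; it would have to be folded into `running_cost`.

```python
import numpy as np

class ModelBasedMPPI:
    """Hypothetical MPPI controller that plans with a model, not env.step."""

    def __init__(self, dynamics, running_cost, horizon=20, samples=100,
                 lam=1.0, noise_sigma=1.0, u_dim=1, seed=0):
        self.dynamics = dynamics          # callable: (x, u) -> next state
        self.running_cost = running_cost  # callable: (x, u) -> float
        self.T, self.K = horizon, samples
        self.lam, self.sigma = lam, noise_sigma
        self.U = np.zeros((horizon, u_dim))  # nominal control sequence
        self.rng = np.random.default_rng(seed)

    def command(self, x0):
        """Return the next control for state x0 using model rollouts only."""
        eps = self.rng.normal(0.0, self.sigma,
                              size=(self.K, self.T, self.U.shape[1]))
        S = np.zeros(self.K)
        for k in range(self.K):
            x = np.asarray(x0, dtype=float)
            for t in range(self.T):
                v = self.U[t] + eps[k, t]
                x = self.dynamics(x, v)       # model step, simulator untouched
                S[k] += self.running_cost(x, v)
        # Weight each sampled trajectory by its exponentiated negative cost.
        w = np.exp(-(S - S.min()) / self.lam)
        w /= w.sum()
        self.U += np.einsum("k,ktm->tm", w, eps)
        u0 = self.U[0].copy()
        # Shift the plan one step forward and pad the end with zeros.
        self.U = np.roll(self.U, -1, axis=0)
        self.U[-1] = 0.0
        return u0
```

With this split, the real environment would only be stepped in the outer control loop with the value returned by `command`, so the simulator never needs to be reset to an earlier state.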
