
Diff-Control: A Stateful Diffusion-based Policy for Imitation Learning

This repo is the official implementation of IROS 2024 paper "Diff-Control: A Stateful Diffusion-based Policy for Imitation Learning" by Xiao Liu, Yifan Zhou, Fabian Weigend, Shubham Sonawani, Shuhei Ikemoto, and Heni Ben Amor. The project website is here.

While imitation learning provides a simple and effective framework for policy learning, acquiring consistent actions during robot execution remains a challenging task. Existing approaches primarily focus on either modifying the action representation at the data curation stage or altering the model itself, both of which do not fully address the scalability of consistent action generation. To overcome this limitation, we introduce the Diff-Control policy, which utilizes a diffusion-based model to learn action representation from a state-space modeling viewpoint. We demonstrate that diffusion-based policies can acquire statefulness through a Bayesian formulation facilitated by ControlNet, leading to improved robustness and success rates. Our experimental results demonstrate the significance of incorporating action statefulness in policy learning, where Diff-Control shows improved performance across various tasks. Specifically, Diff-Control achieves an average success rate of 72% and 84% on stateful and dynamic tasks, respectively. Notably, Diff-Control also shows consistent performance in the presence of perturbations, outperforming other state-of-the-art methods that falter under similar conditions.

We summarize our contributions as:

  1. Add a Bayesian guarantee for diffusion-based policies by using a ControlNet structure as a transition model, ensuring consistent action generation.
  2. Show the advantage of Diff-Control in performing dynamic and temporally sensitive tasks, with success rate improvements of at least 10% and 48%, respectively.
  3. Empirically demonstrate that the Diff-Control policy can perform a wide range of tasks, including high-precision tasks, with at least a 31% success rate improvement.
  4. Show that the Diff-Control policy exhibits notable precision and robustness against perturbations, achieving at least a 30% higher success rate than state-of-the-art methods.
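The Bayesian view behind contribution 1 can be illustrated with a toy 1-D example (a hypothetical sketch, not the paper's actual model): if the base policy's action distribution given the observation and a transition prior centered on the previous action are both Gaussian, their product yields a precision-weighted posterior that pulls each new action toward the previous one — the source of temporal consistency.

```python
import numpy as np

def fuse_gaussians(mu_obs, var_obs, a_prev, var_trans):
    """Posterior over the next action when a Gaussian base policy
    p(a | o) = N(mu_obs, var_obs) is combined with a Gaussian
    transition prior p(a | a_prev) = N(a_prev, var_trans).
    The product of two Gaussians gives a precision-weighted mean."""
    prec = 1.0 / var_obs + 1.0 / var_trans
    mu = (mu_obs / var_obs + a_prev / var_trans) / prec
    return mu, 1.0 / prec

# A stateless policy would output mu_obs = 0.8 and ignore a_prev entirely.
# The stateful fusion pulls the action toward the previous action (0.2):
mu_stateful, var_stateful = fuse_gaussians(
    mu_obs=0.8, var_obs=0.1, a_prev=0.2, var_trans=0.1)
print(mu_stateful)  # 0.5: halfway, since both terms are equally confident
```

Shrinking `var_trans` (a more confident transition model) moves the fused action closer to the previous one, which is the statefulness the policy exploits.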

Getting Started

We provide an implementation using PyTorch. Clone the repo with git clone https://github.com/ir-lab/Diff-Control.git; there are then two options for running the code.

1. Python Scripts

Install PyTorch and then set up the environment using pip install -r requirements.txt. Make sure the corresponding libraries and dependencies are installed in your local environment; e.g., we use PyTorch 1.11.0 with CUDA 11.3.

For training or testing, go to ./Diff-Control and run

python train.py --config ./config/duck_diffusion.yaml

or run

python train.py --config ./config/duck_controlnet.yaml

Note that Diff-Control assumes you have already trained a diffusion model as the base policy.
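The ordering above — base diffusion policy first, then the ControlNet stage — follows the standard ControlNet recipe: the base network is frozen, a trainable copy is attached, and the copy feeds back through zero-initialized projections, so before any ControlNet training the combined model reproduces the frozen base policy exactly. The sketch below illustrates that zero-init property only; the layer shapes and names are made up and are not the repo's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def base_policy(obs, W):
    # Frozen base diffusion policy, abstracted here as a single linear layer.
    return obs @ W

def controlnet_policy(obs, prev_action, W, W_copy, W_zero):
    # Trainable copy processes the state input (previous action chunk),
    # then feeds back through a zero-initialized projection.
    branch = prev_action @ W_copy
    return obs @ W + branch @ W_zero

obs = rng.normal(size=(1, 4))
prev_action = rng.normal(size=(1, 4))
W = rng.normal(size=(4, 2))
W_copy = W.copy()           # copy initialized from the frozen base weights
W_zero = np.zeros((2, 2))   # zero-init: the branch contributes nothing yet

out_base = base_policy(obs, W)
out_ctrl = controlnet_policy(obs, prev_action, W, W_copy, W_zero)
assert np.allclose(out_base, out_ctrl)  # identical before any training
```

As training updates W_copy and W_zero, the branch gradually injects the previous-action information without ever degrading the pretrained base policy at initialization.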

2. Docker Workflow

Edit the conf.sh file to set the environment variables used to start the docker containers.

IMAGE_TAG=  # unique tag to be used for the docker image.
CONTAINER_NAME=UR5  # name of the docker container.
DATASET_PATH=/home/xiao/datasets/  # Dataset path on the host machine.
CUDA_VISIBLE_DEVICES=0  # comma-separated list of GPUs to set visible.

Build the docker image by running ./build/build.sh.

Training or testing

Create or modify a YAML file, e.g. ./Diff-Control/config/duck_controlnet.yaml, and set the mode parameter to select the training or testing routine.

mode:
    mode: 'train'  # 'train' | 'test'

Run the training and test script using the bash file ./run_filter.sh $CONFIG_FILE where $CONFIG_FILE is the path to the config file.

./run_filter.sh ./config/duck_controlnet.yaml

View the logs with docker logs -f $CONTAINER_NAME

TensorBoard

Use the docker logs to copy the TensorBoard link into a browser:

docker logs -f $CONTAINER_NAME-tensorboard

Results

We conduct a series of experiments to evaluate the efficacy of the proposed policy. Specifically, we aim to answer the following questions:

  1. Can the Diff-Control policy demonstrate generalization capabilities across diverse tasks?
  2. To what extent does the Diff-Control policy outperform the current state-of-the-art methods in terms of overall performance?
  3. What are the distinguishing characteristics and benefits of utilizing a stateful policy compared to a non-stateful policy?

Diverse Tasks

The top two rows depict the language-conditioned kitchen task, where the Diff-Control policy successfully generates actions in alignment with the given language command and consistently executes these actions. The third row shows a successful duck-scooping experiment. The last row displays a result from the drum task.

Findings

  1. The Diff-Control policy exhibits the ability to recover from perturbations.
  2. The Diff-Control policy demonstrates superior performance and does not tend to overfit to idle actions.
  3. Diff-Control is robust against visual perturbations such as occlusion.
  4. Diff-Control, as a stateful policy, is beneficial for learning periodic robot behaviors.

Datasets

The datasets used in the experiments can be found in this link.

Citation

  • Please cite the paper if you use any materials from this repo. Thanks!
@article{liudiff,
  title={Diff-Control: A Stateful Diffusion-based Policy for Imitation Learning},
  author={Liu, Xiao and Zhou, Yifan and Weigend, Fabian and Sonawani, Shubham and Ikemoto, Shuhei and Amor, Heni Ben},
  year={2024}
}
@article{liu2024enabling,
  title={Enabling Stateful Behaviors for Diffusion-based Policy Learning},
  author={Liu, Xiao and Weigend, Fabian and Zhou, Yifan and Amor, Heni Ben},
  journal={arXiv preprint arXiv:2404.12539},
  year={2024}
}
