This project forked from vita-epfl/deposit

[ICRA 2023] Official implementation of "A generic diffusion-based approach for 3D human pose prediction in the wild".

License: GNU Affero General Public License v3.0

Python 100.00%

A generic diffusion-based approach for 3D human pose prediction in the wild

Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi

International Conference on Robotics and Automation (ICRA), 2023

[arXiv] [video] [poster]

Abstract

Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can make predictions from noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizons, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is generic enough to improve any 3D pose prediction model, as a pre-processing step to repair its inputs and a post-processing step to refine its outputs.
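
The "single sequence with missing elements" framing can be illustrated with a small mask-building sketch. This is not the repository's code: the function name build_observation_mask and the frame-level granularity are assumptions made for illustration, and the actual model may mask at a finer level (for example, individual joints).

```python
def build_observation_mask(input_n, output_n, missing_observed=()):
    """Return a 0/1 list over the full sequence: 1 = known frame, 0 = frame to denoise.

    Hypothetical helper, frame-level only; not taken from the repository.
    """
    # Observed frames start as known; all future frames are unknown.
    mask = [1] * input_n + [0] * output_n
    for t in missing_observed:
        if not 0 <= t < input_n:
            raise ValueError(f"frame {t} is outside the observation window")
        mask[t] = 0  # occluded or unreliable observed frame, treated like a future frame
    return mask


# 50 observed frames (two of them missing) followed by 25 frames to predict.
mask = build_observation_mask(input_n=50, output_n=25, missing_observed=[3, 17])
print(sum(mask), "known frames,", len(mask) - sum(mask), "frames to denoise")
```

Under this view, a missing observed frame and a future frame are handled the same way: both are zeros in the mask and are filled in by the denoiser.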


Getting started

Requirements

The code requires Python 3.7 or later. The file requirements.txt contains the full list of required Python modules.

pip install -r requirements.txt
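
Before installing, it may help to confirm that the interpreter meets the stated minimum. The following is a minimal sketch; the helper name python_is_supported is hypothetical and not part of the repository.

```python
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the Requirements section


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.7 or later is required, found:", sys.version.split()[0])
```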

Data

Human3.6M in exponential-map format can be downloaded from here.

Directory structure:

H3.6m
|-- S1
|-- S5
|-- S6
|-- ...
|-- S11
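
A quick sanity check of the layout above could look like the sketch below. It only checks the subject folders that are explicitly listed (the ones hidden behind "..." are deliberately not guessed), and it assumes that the path you pass is the H3.6m folder itself; both the helper name missing_subject_dirs and that assumption are illustrative rather than taken from the repository.

```python
from pathlib import Path

# Only the subject folders explicitly shown in the listing above.
LISTED_SUBJECTS = ("S1", "S5", "S6", "S11")


def missing_subject_dirs(data_dir, subjects=LISTED_SUBJECTS):
    """Return the listed subject folders that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in subjects if not (root / name).is_dir()]


if __name__ == "__main__":
    # "H3.6m" is a placeholder path; replace it with your own data directory.
    missing = missing_subject_dirs("H3.6m")
    if missing:
        print("Missing subject folders:", ", ".join(missing))
    else:
        print("All listed subject folders are present.")
```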

AMASS and 3DPW can be downloaded from their official websites.

Specify the data path with the --data_dir argument.

Training and Testing

Human3.6M

Train a short-term model and a long-term model with these commands:

python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 50 --output_n 5 --data_dir data_dir --output_dir model_s
python main_tcd_h36m.py --mode train --epochs 50 --data all --joints 22 --input_n 55 --output_n 20 --data_dir data_dir --output_dir model_l

To evaluate the TCD model, run the following command. Specify the short-term and long-term model checkpoint directories with the --model_s and --model_l arguments.

python main_tcd_h36m.py --mode test --data all --joints 22 --input_n 50 --output_n 25 --data_dir data_dir --model_s model_s --model_l model_l --output_dir model_l

The results will be saved in a CSV file in the output directory.
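
The frame counts in these commands are consistent with a two-stage cascade: the short-term model maps 50 observed frames to 5 predicted frames, and the long-term model takes those 55 frames and predicts 20 more, for 25 predicted frames in total. The sketch below only illustrates that bookkeeping. The function cascade_predict and the placeholder predictor repeat_last are hypothetical stand-ins, not the repository's models, and the assumption that the long-term input is the observation followed by the short-term prediction is inferred from the flag values.

```python
def cascade_predict(observed, short_model, long_model, short_n=5, long_n=20):
    """Chain two predictors so that the second sees the first one's output.

    Each model is any callable (frames, n) -> list of n predicted frames.
    """
    short_pred = short_model(observed, short_n)  # 50 frames -> 5 frames
    long_input = observed + short_pred           # 50 + 5 = 55 frames
    long_pred = long_model(long_input, long_n)   # 55 frames -> 20 frames
    return short_pred + long_pred                # 5 + 20 = 25 frames


def repeat_last(frames, n):
    """Placeholder predictor: repeats the last frame n times."""
    return [frames[-1]] * n


# Toy sequence: 50 frames, each a one-element "pose".
observed = [[float(t)] for t in range(50)]
prediction = cascade_predict(observed, repeat_last, repeat_last)
print(len(prediction))  # 25
```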

AMASS and 3DPW

You can train a model on the AMASS dataset using the following command:

python main_amass.py --mode train --epochs 50 --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass

Then you can evaluate it on both AMASS and 3DPW datasets:

python main_amass.py --mode test --dataset AMASS --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass
python main_amass.py --mode test --dataset 3DPW --data all --joints 18 --input_n 50 --output_n 25 --data_dir data_dir --output_dir model_amass

The results will be saved in CSV files in the output directory.
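
Since the two evaluation commands differ only in the --dataset value, they can be generated from one template. The sketch below uses only the script name and flags shown above; the helper names amass_eval_command and evaluate_all are hypothetical, and nothing is executed unless run=True is passed.

```python
import subprocess
import sys


def amass_eval_command(dataset, data_dir, output_dir):
    """Build the evaluation command for one dataset, using the flags shown above."""
    if dataset not in ("AMASS", "3DPW"):
        raise ValueError(f"unexpected dataset: {dataset}")
    return [
        sys.executable, "main_amass.py",
        "--mode", "test",
        "--dataset", dataset,
        "--data", "all",
        "--joints", "18",
        "--input_n", "50",
        "--output_n", "25",
        "--data_dir", data_dir,
        "--output_dir", output_dir,
    ]


def evaluate_all(data_dir, output_dir, run=False):
    """Return both evaluation commands; execute them only if run is True."""
    commands = [amass_eval_command(d, data_dir, output_dir) for d in ("AMASS", "3DPW")]
    if run:
        for command in commands:
            subprocess.run(command, check=True)  # stop on the first failure
    return commands


# Dry run: print the commands without executing them.
for command in evaluate_all("data_dir", "model_amass"):
    print(" ".join(command))
```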

Work in Progress

This repository is being updated so stay tuned!

Acknowledgments

The overall code framework (data loading, training, testing, etc.) was adapted from HRI. The base of the diffusion model was borrowed from CSDI.

Citation

@INPROCEEDINGS{saadatnejad2023diffusion,
  author = {Saeed Saadatnejad and Ali Rasekh and Mohammadreza Mofayezi and Yasamin Medghalchi and Sara Rajabzadeh and Taylor Mordan and Alexandre Alahi},
  title = {A generic diffusion-based approach for 3D human pose prediction in the wild},
  booktitle = {International Conference on Robotics and Automation (ICRA)},
  year = {2023}
}

License

AGPL-3.0 license
