WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion

Introduction

This repository is the official PyTorch implementation of WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion. For more information, please visit our project page.

Installation

Please see Installation for details.

Quick Demo

Registration

To download the SMPL body models (Neutral, Female, and Male), you need to register on both the SMPL and SMPLify websites. The username and password for both accounts are used when fetching the demo data.

Next, run the following script to fetch demo data. This script will download all the required dependencies including trained models and demo videos.

bash fetch_demo_data.sh

You can try the demo on one example video:

python demo.py --video examples/IMG_9732.mov --visualize

By default, we assume the camera focal length following CLIFF. If the camera intrinsics [fx fy cx cy] are known, you can provide them for SLAM with --calib, as in the demo example below:

python demo.py --video examples/drone_video.mp4 --calib examples/drone_calib.txt --visualize
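
For reference, the CLIFF-style default can be sketched in a few lines. This is a minimal illustration, assuming the commonly used CLIFF approximation (focal length equal to the image diagonal in pixels, principal point at the image center); the function name `default_intrinsics` is hypothetical and this is not the repository's code. The values are printed in the [fx fy cx cy] order mentioned above, but the exact layout expected in a calibration file should be checked against examples/drone_calib.txt.

```python
import math


def default_intrinsics(width, height):
    """Return (fx, fy, cx, cy) for an image of the given pixel size.

    Assumption (CLIFF-style approximation): the focal length equals the
    image diagonal and the principal point is the image center.
    """
    focal = math.sqrt(width ** 2 + height ** 2)
    return focal, focal, width / 2.0, height / 2.0


# Illustrative values for a 1920x1080 video.
fx, fy, cx, cy = default_intrinsics(1920, 1080)
print(f"{fx:.1f} {fy:.1f} {cx:.1f} {cy:.1f}")
```

If the true intrinsics differ a lot from this approximation (for example with a wide-angle or zoomed lens), passing measured values with --calib is likely the better choice.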

If you only need motion in camera coordinates, you can skip SLAM:

python demo.py --video examples/IMG_9732.mov --visualize --estimate_local_only
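
To run the demo over several videos, the commands above can be assembled programmatically. The sketch below uses only the flags shown in this section (--video, --calib, --visualize, --estimate_local_only); the helper name `build_demo_command` is hypothetical and not part of the repository. It only prints the commands; pass each list to `subprocess.run` from the repository root to actually execute it.

```python
import shlex


def build_demo_command(video, calib=None, visualize=True, local_only=False):
    """Build the argument list for one demo.py run (hypothetical helper)."""
    cmd = ["python", "demo.py", "--video", video]
    if calib is not None:
        # Known camera intrinsics for SLAM.
        cmd += ["--calib", calib]
    if visualize:
        cmd.append("--visualize")
    if local_only:
        # Skip SLAM and estimate camera-coordinate motion only.
        cmd.append("--estimate_local_only")
    return cmd


jobs = [
    build_demo_command("examples/IMG_9732.mov"),
    build_demo_command("examples/drone_video.mp4", calib="examples/drone_calib.txt"),
    build_demo_command("examples/IMG_9732.mov", local_only=True),
]
for cmd in jobs:
    print(shlex.join(cmd))
```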

Dataset

Please see Dataset for details.

Evaluation

# Evaluate on 3DPW dataset
python -m lib.eval.evaluate_3dpw --cfg configs/yamls/demo.yaml TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar

# Evaluate on RICH dataset
python -m lib.eval.evaluate_rich --cfg configs/yamls/demo.yaml TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar

# Evaluate on EMDB dataset (also computes W-MPJPE and WA-MPJPE)
python -m lib.eval.evaluate_emdb --cfg configs/yamls/demo.yaml --eval-split 1 TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar   # EMDB 1

python -m lib.eval.evaluate_emdb --cfg configs/yamls/demo.yaml --eval-split 2 TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar   # EMDB 2
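
W-MPJPE and WA-MPJPE are, as we understand them, world-coordinate variants of the mean per-joint position error (MPJPE) that differ in how the predicted trajectory is aligned to the ground truth before the error is measured. The sketch below shows only the final averaging step on plain nested lists and omits any alignment; the function name `mpjpe` is illustrative and this is not the repository's evaluation code.

```python
import math


def mpjpe(pred, target):
    """Mean Euclidean distance over all frames and joints.

    pred, target: nested lists shaped [frames][joints][3], in the same
    units (the result is in those units). No alignment is applied here.
    """
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for p, t in zip(pred_frame, target_frame):
            total += math.dist(p, t)
            count += 1
    return total / count


# One frame, two joints: the errors are 0 and 5, so the mean is 2.5.
pred = [[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]]
target = [[[0.0, 0.0, 0.0], [1.0, 3.0, 4.0]]]
print(mpjpe(pred, target))
```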

Training

Will be updated.

Acknowledgement

We sincerely thank Hongwei Yi and Silvia Zuffi for the discussion and proofreading. Part of this work was done when Soyong Shin was an intern at the Max Planck Institute for Intelligent Systems.

The base implementation is largely borrowed from VIBE and TCMR. We use ViTPose for 2D keypoint detection, and DPVO and DROID-SLAM for extracting camera motion. Please visit their official websites for more details.

TODO

  • Training implementation

  • Colab / Hugging Face release

  • Demo for custom videos

Citation

@article{shin2023wham,
    title={WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion},
    author={Shin, Soyong and Kim, Juyong and Halilaj, Eni and Black, Michael J.},
    journal={arXiv preprint 2312.07531},
    year={2023}}

License

Please see License for details.

Contact

Please contact [email protected] for any questions related to this work.
