SCARF: Capturing and Animation of Body and Clothing from Monocular Video

This is the PyTorch implementation of SCARF. For more details, please check our Project page.

SCARF extracts a 3D clothed avatar from a monocular video.
SCARF allows us to synthesize novel views of the reconstructed avatar and to animate it with SMPL-X identity shape and pose control. The disentanglement of the body and clothing further enables us to transfer clothing between subjects for virtual try-on applications.

The key features:

  1. animate the avatar by changing body poses (including hand articulation and facial expressions),
  2. synthesize novel views of the avatar, and
  3. transfer clothing between avatars for virtual try-on applications.

Getting Started

Clone the repo:

git clone https://github.com/yfeng95/SCARF
cd SCARF

Requirements

conda create -n scarf python=3.9
conda activate scarf
pip install -r requirements.txt

If you have problems installing pytorch3d, please follow its official installation instructions.
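
If the pip install keeps failing, one common fallback (our suggestion, not part of the official SCARF instructions) is to build PyTorch3D directly from its GitHub repository, as described in PyTorch3D's own install guide:

# sketch: build PyTorch3D from source if no prebuilt wheel matches your setup
pip install "git+https://github.com/facebookresearch/pytorch3d.git"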

Download data

bash fetch_data.sh

Visualization

  • check the training frames:
python main_demo.py --vis_type capture --frame_id 0
  • novel view synthesis for a given frame ID:
python main_demo.py --vis_type novel_view --frame_id 0
  • extract the mesh and visualize it:
python main_demo.py --vis_type extract_mesh --frame_id 0

You can go to our project page and play with the extracted meshes, or inspect an exported mesh locally as in the sketch below.
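
The following is a minimal sketch for a quick local sanity check; the output path and file name are hypothetical (check what the extraction step actually writes), and it uses the generic trimesh package, which is not part of SCARF's requirements:

# minimal sketch: load and view an extracted mesh (path is hypothetical)
import trimesh

mesh = trimesh.load("exps/snapshot/mesh_frame0.obj")  # replace with the actual output path
print(mesh.vertices.shape, mesh.faces.shape)          # basic sanity check on the geometry
mesh.show()                                           # opens an interactive viewer (needs pyglet)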

  • animation
python main_demo.py --vis_type animate
  • clothing transfer
# apply clothing from another model
python main_demo.py --vis_type novel_view --clothing_model_path exps/snapshot/male-3-casual
# transfer the clothing to a new body
python main_demo.py --vis_type novel_view --body_model_path exps/snapshot/male-3-casual

More data and trained models can be found here; download them and put them into ./exps.
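
For example, after downloading a snapshot archive (the archive name below is hypothetical; use whatever file you actually downloaded), unpack it so the demo flags above can find it:

# sketch: place a downloaded model under ./exps (archive name is hypothetical)
mkdir -p exps/snapshot
unzip downloaded_snapshot.zip -d exps/snapshot/
ls exps/snapshot   # should now contain folders such as male-3-casual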

Training

  • training with the SCARF video example
bash train.sh
  • training with other videos
    check here to prepare data from your own videos, then change the data_cfg accordingly; a hypothetical sketch follows below.
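
A hypothetical sketch of that workflow (the config path and the way train.sh consumes data_cfg are assumptions; check the script and the data preparation guide for the actual interface):

# hypothetical sketch: training on your own processed video
# 1. prepare frames, masks, and poses for your video as described in the data preparation guide
# 2. write a data_cfg for it, e.g. configs/data/my_video.yml (path is an assumption)
# 3. point train.sh at that config (edit the data_cfg entry it uses), then run:
bash train.sh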

TODO

  • add more processed data and trained models
  • code for refining the pose of trained models
  • integration with Instant-NGP

Citation

@inproceedings{Feng2022scarf,
    author = {Feng, Yao and Yang, Jinlong and Pollefeys, Marc and Black, Michael J. and Bolkart, Timo},
    title = {Capturing and Animation of Body and Clothing from Monocular Video},
    year = {2022},
    booktitle = {SIGGRAPH Asia 2022 Conference Papers},
    articleno = {45},
    numpages = {9},
    location = {Daegu, Republic of Korea},
    series = {SA '22}
} 

Acknowledgments

We thank Sergey Prokudin, Weiyang Liu, Yuliang Xiu, Songyou Peng, Qianli Ma for fruitful discussions, and PS members for proofreading. We also thank Betty Mohler, Tsvetelina Alexiadis, Claudia Gallatz, and Andres Camilo Mendoza Patino for their support with data.

Special thanks to Boyi Jiang and Sida Peng for sharing their data.

We benefit from several great open-source resources. Some functions are based on other repositories; we acknowledge their origins individually in each file.

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms in the LICENSE.

Disclosure

MJB has received research gift funds from Adobe, Intel, Nvidia, Meta/Facebook, and Amazon. MJB has financial interests in Amazon, Datagen Technologies, and Meshcapade GmbH. While MJB is a part-time employee of Meshcapade, his research was performed solely at, and funded solely by, the Max Planck Society. While TB is a part-time employee of Amazon, this research was performed solely at, and funded solely by, MPI.

Contact

For more questions, please contact [email protected]. For commercial licensing, please contact [email protected].

scarf's Issues

About training

Thanks for the excellent work!

When I try to train stage 0 from scratch (without using the pretrained model provided in the code), I get all-black images.
The code works well when I train with the pretrained model.
What might be the reason for this?

Question about sampling rays

Hi, thanks for the great work! I have 2 questions:

  1. Why are the camera intrinsics and pose not used in sampling rays?
  2. Where is the customized volumetric rendering?

training errors

When starting to train the hybrid model, the loss appears to be anomalous and the rendered images are all black.
It seems to be caused by the ID-MRF loss function.

scarf_utilities is not accessible

Hello, the link you provided in fetch_data.sh for the scarf_utilities.zip is not accessible. Could you please share it again?

Thank you.

extraction of clothes mesh

Thanks for your great work!
I wonder if we could extract the clothing mesh separately and convert it to T-pose?
Also, the reconstructed clothing is one piece; is there any way to separate it into upper-body and lower-body parts?

How can we obtain the 'full_pose' parameter in 'pixie_radioactive.pkl'?

Hello. First of all, thank you for releasing your awesome work as open source.

I am curious how we can obtain the 'full_pose' parameter, which is a tensor of size 5533 per frame, in the given demo pose file 'pixie_radioactive.pkl'.

I think this can be derived from PIXIE, but I don't know how to get it.

The PIXIE parameters from the preprocessing have different shapes compared to the given one.

Thank you.

Issue about training the hybrid stage

Thanks for your great work.

When I train the hybrid stage, the losses become NaN, as the attached loss graph shows.

Also, the optimization is very slow (200 steps take about 30 minutes).
