The training and demo code for ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering (CVPR 2024)

Home Page: https://vcai.mpi-inf.mpg.de/projects/ash/

ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering (CVPR 2024)

Haokai Pang† · Heming Zhu† · Adam Kortylewski · Christian Theobalt · Marc Habermann‡

† Joint first authors.

‡ Corresponding author.


News

2024-6-14 The Training Code and the Data Processing Code are available! 🎆🎆🎆

2024-3-29 The initial release, i.e., the Demo Code, is available. The Training Code is on the way. For more details, please check out the project page 😃.


Installation

Clone the repo

git clone [email protected]:kv2000/ASH.git --recursive

cd ./submodules/diff-gaussian-rasterization/
git submodule update --init --recursive

Install the dependencies

The code is tested with Python 3.9, PyTorch 1.12.1, and CUDA 11.3.

Setup DeepCharacters Pytorch

Firstly, install the underlying clothed human body model, 🎆DeepCharacters Pytorch🎆, which also includes the dependencies needed for this repo.

Setup 3DGS

Then, set up the submodules for 3D Gaussian Splatting.

# the env with DeepCharacters Pytorch
conda activate mpiiddc 

# 3DGS go
cd ./submodules/diff-gaussian-rasterization/
python setup.py install

cd ../simple-knn/
python setup.py install

Setup the metadata and checkpoints

You may find the metadata and the checkpoints at this link.

The extracted metadata and checkpoints follow the folder structure below:

# for the checkpoints
checkpoints
|--- Subject0001
    |---deformable_character_checkpoint.pth # character checkpoints
    |---gaussian_checkpoints.tar            # gaussian checkpoints

# for the meta data
meta_data
|--- Subject0001
    |---skeletoolToGTPose                   # training poses
    |   |--- ... 
    |
    |---skeletoolToGTPoseTest               # Testing poses
    |   |--- ...
    |
    |---skeletoolToGTPoseRetarget           # retargeted poses from another subject
    |   |--- ...
    |
    |--- ...                                # Others
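Before running the demo, it can save a round-trip to verify that the extracted files match the layout above. A minimal sketch, run from the repo root; the `check_layout` helper name is ours, not part of the repo:

```shell
# Verify the checkpoint/metadata layout described above for one subject.
# check_layout is a hypothetical helper, not a script shipped with the repo.
check_layout() {
    subject="$1"
    status=0
    for path in \
        "checkpoints/$subject/deformable_character_checkpoint.pth" \
        "checkpoints/$subject/gaussian_checkpoints.tar" \
        "meta_data/$subject/skeletoolToGTPose" \
        "meta_data/$subject/skeletoolToGTPoseTest" \
        "meta_data/$subject/skeletoolToGTPoseRetarget"
    do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        echo "layout OK for $subject"
    fi
    return "$status"
}

# Example: check_layout Subject0001
```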

Run the demo

Run the following and the results will be stored in ./dump_results/ by default.

bash run_inference.sh

Train your model

Step 1. Data Processing

  • Download the compressed raw data from this link into ./raw_data/.
  • Decompress the data with tar -xzvf Subject0022.tar.gz.
  • Run the (slurm) bash script ./process_video/bash_get_image.sh, which extracts the masked images from the raw RGB videos and the foreground-mask videos. The provided script supports parallelizing the image extraction with slurm job arrays.
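The download-and-unpack steps above can be sketched as shell commands; the archive name follows the Subject0022 example, and the frame extraction itself stays with the provided script:

```shell
# Step 1 sketch: place the downloaded archive in ./raw_data/, then unpack it there.
mkdir -p raw_data
if [ -f raw_data/Subject0022.tar.gz ]; then
    tar -xzvf raw_data/Subject0022.tar.gz -C raw_data
fi

# Step 2 (masked-frame extraction from the RGB and foreground-mask videos) is
# handled by the provided slurm-aware script:
# bash ./process_video/bash_get_image.sh
```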

Step 2. Start Training

Run the following and the results will be stored in ./dump_results/ by default.

bash run_train.sh

The folder structure for the training is as follows:

# training outputs
dump_results
|--- Subject0022
    |---cached_files                                # Precomputed character-related data
    |   |--- cached_fin_rotation_quad.pkl
    |   |--- cached_fin_translation_quad.pkl
    |   |--- cached_joints.pkl
    |   |--- cached_ret_canonical_delta.pkl
    |   |--- cached_ret_posed_delta.pkl
    |   |--- cached_temp_vert_normal.pkl
    |
    |---checkpoints                               
    |   |--- ...
    |
    |---exp_stats                                   # Tensorboard Logs
    |   |--- ...
    |
    |---validations_fine                            # Validation images every X frames

Note that the first time the training script runs, it pre-computes and stores character-related data in ./dump_results/[Subject Name]/cached_files/. This cache greatly speeds up training and reduces its GPU memory usage.
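To see at a glance whether that cache is already complete for a subject, and hence whether the next run will skip the precompute, a small helper can check for the six files listed above; `cache_complete` is our name, not part of the repo:

```shell
# Report whether the per-subject cache from the first training run is complete.
# cache_complete is a hypothetical helper, not a script shipped with the repo.
cache_complete() {
    dir="dump_results/$1/cached_files"
    for f in cached_fin_rotation_quad.pkl cached_fin_translation_quad.pkl \
             cached_joints.pkl cached_ret_canonical_delta.pkl \
             cached_ret_posed_delta.pkl cached_temp_vert_normal.pkl
    do
        if [ ! -f "$dir/$f" ]; then
            echo "incomplete: $dir/$f not found (next run will precompute)"
            return 1
        fi
    done
    echo "cache complete: $dir"
}

# Example: cache_complete Subject0022
```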

Step 3. Train with your own data.

Please check out this issue for some hints on training with your own data; discussion is welcome :).


Todo list

  • Data processing for Training
  • Training Code

Citation

If you find our work useful for your research, please consider citing our paper!

@InProceedings{Pang_2024_CVPR,
    author    = {Pang, Haokai and Zhu, Heming and Kortylewski, Adam and Theobalt, Christian and Habermann, Marc},
    title     = {ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {1165-1175}
}

Contact

For questions or clarifications, feel free to get in touch with:
Heming Zhu: [email protected]
Marc Habermann: [email protected]


License

DeepCharacters Pytorch is released under the CC-BY-NC license, which also applies to the pre-trained models and the metadata.


Acknowledgements

Christian Theobalt was supported by ERC Consolidator Grant 4DReply (No.770784). Adam Kortylewski was supported by the German Science Foundation (No.468670075). This project was also supported by the Saarbrücken Research Center for Visual Computing, Interaction, and AI. We would also like to thank Andrea Boscolo Camiletto and Muhammad Hamza Mughal for their efforts and discussions on motion retargeting.

We also benefit from a number of open-source resources.

ash's Issues

Some hints on training with your own data

Since our approach requires only the motion textures as input conditions, it is possible, and intuitive, to adapt it to different kinds of drivable human templates.

Assume that you have a skinned/drivable template mesh with a UV parameterization.
Since the training dataloader already provides the tools that render per-vertex information into textures, it is straightforward to adapt it for training on other drivable human models given the following ingredients:

  • The canonical-pose vertex positions (see cached_ret_canonical_delta.pkl)
  • The posed vertex positions (see cached_ret_posed_delta.pkl)
  • The posed vertex normals (see cached_temp_vert_normal.pkl)
  • The per-vertex rotation and translation quaternions (see cached_fin_rotation_quad.pkl, cached_fin_translation_quad.pkl)
  • The joint positions (see cached_joints.pkl)
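When exporting these ingredients from your own template, a quick load check can catch broken files early. A hedged sketch, assuming the caches are plain Python pickles (the repo's actual serialization may differ) and using a hypothetical `check_ingredients` helper:

```shell
# Load each ingredient file and report its top-level Python type, or flag it
# as missing. Assumes plain pickle files, which is an assumption on our part.
check_ingredients() {
    python3 - "$@" <<'EOF'
import os
import pickle
import sys

for path in sys.argv[1:]:
    if not os.path.exists(path):
        print(path, "missing")
        continue
    with open(path, "rb") as fh:
        obj = pickle.load(fh)
    print(path, type(obj).__name__)
EOF
}

# Example: check_ingredients dump_results/Subject0022/cached_files/cached_joints.pkl
```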
