
conallwang / mega


The official implementation of "MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing".

Shell 0.63% Python 40.94% CMake 0.09% C++ 28.34% Cuda 19.91% C 2.09% Dockerfile 0.14% HTML 7.85%

mega's Introduction

MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing

The official repo for "MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing"

📣 Updates

[07/8/2024] The data and pretrained models of Subject 306 have been released here!

[01/8/2024] The code has been released!

[06/5/2024] Add more results to the project page.

[28/4/2024] The official repo is initialized.

TODO

  • Release the project page
  • Add more results to the project page
  • Release the code
  • Release the data and Subject 306's pretrained model.
  • Release more data & pretrained models (Subject 218 and 304), if we find more free cloud storage ~
  • Improve the performance and try to support more editing applications

Abstract

Creating high-fidelity head avatars from multi-view videos is a core issue for many AR/VR applications. However, existing methods usually struggle to obtain high-quality renderings for all different head components simultaneously since they use one single representation to model components with drastically different characteristics (e.g., skin vs. hair). In this paper, we propose a Hybrid Mesh-Gaussian Head Avatar (MeGA) that models different head components with more suitable representations. Specifically, we select an enhanced FLAME mesh as our facial representation and predict a UV displacement map to provide per-vertex offsets for improved personalized geometric details. To achieve photorealistic renderings, we obtain facial colors using deferred neural rendering and disentangle neural textures into three meaningful parts. For hair modeling, we first build a static canonical hair using 3D Gaussian Splatting. A rigid transformation and an MLP-based deformation field are further applied to handle complex dynamic expressions. Combined with our occlusion-aware blending, MeGA generates higher-fidelity renderings for the whole head and naturally supports more downstream tasks. Experiments on the NeRSemble dataset demonstrate the effectiveness of our designs, outperforming previous state-of-the-art methods and supporting various editing functionalities, including hairstyle alteration and texture editing.

Pipeline

[Figure: MeGA pipeline overview]

Setup

Environment

The following commands build the conda environment:

# 1. create a new conda env & activate
conda create -n mega python=3.9
conda activate mega

# 2. run our scripts to install requirements
./create_env.sh

Data

We use the same 9 subjects from the NeRSemble dataset as GaussianAvatars in our experiments. Based on their provided data, we additionally generate depth maps and face parsing results. All pre-processed data is provided here.

Whether you want to train or test our method, you need to download the data and decompress it to a directory of your choice, e.g., /path/to/nersemble.

Training

To train a full MeGA avatar (taking Subject 306 as an example), you need to take two steps.

First, train a canonical hair model using

# Before executing the following commands, change every placeholder path ('/path/to/...') to your own path.
# Files to edit: ['./scripts/train_hair.sh', './configs/nersemble/306/hair.yaml']

cd /path/to/MeGA
bash ./scripts/train_hair.sh

After that, your hair model will be saved in your specified directory (i.e., $WORKSPACE/$VERSION/checkpoint_reset.pth).
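Before moving on to the second step, it can help to confirm that the hair checkpoint exists. The following is a minimal sketch, not part of the repository: it assumes WORKSPACE and VERSION are exported as environment variables with the same values used in ./scripts/train_hair.sh, and the fallback values shown are hypothetical.

```python
import os
from pathlib import Path


def hair_checkpoint_path(workspace: str, version: str) -> Path:
    """Build $WORKSPACE/$VERSION/checkpoint_reset.pth from the two values."""
    return Path(workspace) / version / "checkpoint_reset.pth"


if __name__ == "__main__":
    # Assumption: WORKSPACE and VERSION match the values in train_hair.sh.
    # The fallbacks below are hypothetical placeholders, not repository defaults.
    workspace = os.environ.get("WORKSPACE", "/path/to/checkpoints")
    version = os.environ.get("VERSION", "my_hair_run")
    ckpt = hair_checkpoint_path(workspace, version)
    status = "found" if ckpt.is_file() else "missing"
    print(f"{ckpt}: {status}")
```

If the script reports the file as missing, check the output directory configured in the hair training script and config before starting the full training.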

Next, train the full avatar model using

# Again, change every placeholder path ('/path/to/...') to your own path.
# Files to edit: ['./scripts/train_full.sh', './configs/nersemble/306/full.yaml']

cd /path/to/MeGA
bash ./scripts/train_full.sh
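Both training steps depend on replacing every '/path/to/...' placeholder in the listed scripts and configs. A small helper like the one below can flag any that were missed. This is an illustrative sketch rather than a tool shipped with the repository; the function name find_placeholders is hypothetical, and the file list is simply the four files named in the comments above.

```python
from pathlib import Path

PLACEHOLDER = "/path/to/"


def find_placeholders(files):
    """Return (file, line_number, line) for each line still containing the placeholder."""
    hits = []
    for name in files:
        path = Path(name)
        if not path.is_file():
            continue  # skip files that are not present in this checkout
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if PLACEHOLDER in line:
                hits.append((name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Run from the repository root (/path/to/MeGA in the commands above).
    files = [
        "./scripts/train_hair.sh",
        "./configs/nersemble/306/hair.yaml",
        "./scripts/train_full.sh",
        "./configs/nersemble/306/full.yaml",
    ]
    for name, lineno, line in find_placeholders(files):
        print(f"{name}:{lineno}: {line}")
```

Lines that mention the placeholder only inside a comment are reported as well, so a non-empty result is a prompt to look at the file, not necessarily an error.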

Testing (Including computing metrics)

If you only want to render images from the test and validation sets or compute metrics, you can run

cd /path/to/MeGA
bash ./scripts/metrics.sh

The script will render images first and then compute metrics automatically.

Fun editing

As mentioned in our paper, MeGA supports several head-editing operations. All related code is in ./funny_demo.

Hair alteration

To perform hair alteration (e.g., replacing Subject 218's hair with Subject 306's hair), you can run

cd /path/to/MeGA
bash ./scripts/alter_hair.sh

Texture editing

We have provided some 2D painting images in the preprocessed data (/path/to/nersemble/preprocess/306/306_EMO-1_v16_DS2-0.5x_lmkSTAR_teethV3_SMOOTH_offsetS_whiteBg_maskBelowLine/images/00000_08_*.png).

You can also create your own 2D painting images and apply them to the 3D head avatar with our script:
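To see which painting images are available after unpacking the data, the glob pattern above can be expanded with a few lines of Python. This is only a convenience sketch: list_paintings is a hypothetical helper, and the directory still contains the '/path/to/nersemble' placeholder, which must be replaced with your own data location.

```python
from pathlib import Path


def list_paintings(images_dir, pattern="00000_08_*.png"):
    """Return the sorted painting images in images_dir that match the pattern."""
    root = Path(images_dir)
    if not root.is_dir():
        return []  # nothing to list if the data has not been unpacked here
    return sorted(root.glob(pattern))


if __name__ == "__main__":
    # Replace /path/to/nersemble with the directory holding the decompressed data.
    images_dir = (
        "/path/to/nersemble/preprocess/306/"
        "306_EMO-1_v16_DS2-0.5x_lmkSTAR_teethV3_SMOOTH_offsetS_whiteBg_maskBelowLine/images"
    )
    paintings = list_paintings(images_dir)
    print(f"{len(paintings)} painting image(s) found")
    for path in paintings:
        print(path.name)
```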

cd /path/to/MeGA
bash ./scripts/paint.sh

The optimization takes several minutes.

Render videos using pre-trained models

We take the painted avatar above as an example. The painted avatar is saved to a directory such as '/path/to/checkpoints/MeGA/0801/train_306_b16_MeGA/duola', and you can then render sequences with it:

cd /path/to/MeGA
bash ./scripts/render.sh

The results are saved to a directory such as '/path/to/checkpoints/MeGA/0801/train_306_b16_MeGA/duola/exp3_eval'. If you want a video result, run './scripts/img2video.sh' (which uses ffmpeg):

cd /path/to/MeGA
bash ./scripts/img2video.sh /path/to/checkpoints/MeGA/0801/train_306_b16_MeGA/duola/exp3_eval/renders

The video will be written to '/path/to/checkpoints/MeGA/0801/train_306_b16_MeGA/duola/exp3_eval/output.mp4'.
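For readers who want to adjust the frame rate or encoding, the sketch below shows one way such an image-to-video step could be expressed with ffmpeg. It is not the content of './scripts/img2video.sh': the frame rate of 25, the '*.png' frame naming, and the libx264 settings are all assumptions, and build_ffmpeg_command is a hypothetical helper. The script only prints the command; it does not run ffmpeg.

```python
import shlex
from pathlib import Path


def build_ffmpeg_command(renders_dir, fps=25, pattern="*.png"):
    """Build an ffmpeg command that encodes the frames in renders_dir.

    The output is placed next to the renders directory as output.mp4,
    mirroring the location described above.
    """
    renders = Path(renders_dir)
    output = renders.parent / "output.mp4"
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),    # assumed frame rate, not taken from the repo
        "-pattern_type", "glob",   # glob input; not available in all Windows builds
        "-i", str(renders / pattern),
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",     # widely compatible pixel format
        str(output),
    ]


if __name__ == "__main__":
    cmd = build_ffmpeg_command(
        "/path/to/checkpoints/MeGA/0801/train_306_b16_MeGA/duola/exp3_eval/renders"
    )
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```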

Pretrained Model

We provide our pretrained models here.

Citation

If you find this code useful for your research, please consider citing:

@article{wang2024mega,
  title={MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing},
  author={Wang, Cong and Kang, Di and Sun, He-Yi and Qian, Shen-Han and Wang, Zi-Xuan and Bao, Linchao and Zhang, Song-Hai},
  journal={arXiv preprint arXiv:2404.19026},
  year={2024}
}


mega's Issues

Request for some data

Dear authors,
I greatly appreciate the work you've done!
I would like to follow your work, but I noticed that your dataset only includes the Subject 306 avatar. I understand that due to cloud storage limitations, you couldn't upload everything, but could you upload only the modified parts, like init_pts_150000.npy, or provide the processing scripts instead?
I believe most readers already have the parts that overlap with the GaussianAvatars dataset. All we need are the modified parts.
I would greatly appreciate it!
