rasmusrpaulsen / deep-mvlm

A tool for precisely placing 3D landmarks on 3D facial scans based on the paper "Multi-view Consensus CNN for 3D Facial Landmark Placement"

Home Page: http://shapeml.compute.dtu.dk/

License: MIT License

Python 100.00%
landmark-detection 3d-models deep-learning neural-network hourglass-network pytorch-implementation landmark-estimation 3d-landmarks facial-landmarks 3d-face-recognition

deep-mvlm's Introduction

Deep learning based 3D landmark placement

A tool for accurately placing 3D landmarks on 3D facial scans based on the paper Multi-view Consensus CNN for 3D Facial Landmark Placement.

Overview

Citing Deep-MVLM

If you use Deep-MVLM in your research, please cite the paper:

@inproceedings{paulsen2018multi,
  title={Multi-view Consensus CNN for 3D Facial Landmark Placement},
  author={Paulsen, Rasmus R and Juhl, Kristine Aavild and Haspang, Thilde Marie and Hansen, Thomas and Ganz, Melanie and Einarsson, Gudmundur},
  booktitle={Asian Conference on Computer Vision},
  pages={706--719},
  year={2018},
  organization={Springer}
}

Updates

  • 24-03-2021: The "cannot instantiate 'WindowsPath'" issue should now be solved. Pre-trained models no longer contain path variables.

Getting Deep-MVLM

Download or clone from GitHub.

Requirements

The code has been tested under Windows 10 both on a GPU-enabled computer (Titan X) and without a GPU (works, but slowly). It has been tested with the following dependencies:

  • Python 3.7
  • Pytorch 1.2
  • vtk 8.2
  • libnetcdf 4.7.1 (needed by vtk)
  • imageio 2.6
  • matplotlib 3.1.1
  • scipy 1.3.1
  • scikit-image 0.15
  • tensorboard 1.14
  • absl-py 0.8

Getting started

The easiest way to use Deep-MVLM is to use the pre-trained models to place landmarks on your meshes. To place the DTU-3D landmarks on a mesh, try:

python predict.py --c configs/DTU3D-RGB.json --n assets/testmeshA.obj

This should create two landmark files (a .vtk file and a .txt file) in the assets directory and also show a window with the face mesh and its landmarks (it is a 3D rendering that can be manipulated with the mouse):

Predicted output
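The .txt file can be read directly in your own scripts. A minimal sketch, assuming one landmark per line with whitespace-separated x, y and z coordinates (check the generated file to confirm the exact layout; the file name below is a placeholder):

import numpy as np

# Placeholder name - use the .txt file that was written to the assets directory.
landmarks = np.loadtxt("assets/testmeshA_landmarks.txt")
print(landmarks.shape)  # expected: (number_of_landmarks, 3)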

Supported formats and types

The framework can place landmarks on surfaces without texture, with texture, and with vertex coloring. The supported formats are:

  • OBJ: textured surfaces (including multiple textures), non-textured surfaces
  • WRL: textured surfaces (only single texture), non-textured surfaces
  • VTK: textured surfaces (only single texture), vertex-colored surfaces, non-textured surfaces
  • STL: non-textured surfaces
  • PLY: non-textured surfaces

Rendering types

The type of 3D rendering used is specified in the image_channels setting in the JSON configuration file. The options are:

  • geometry: pure geometry rendering without texture (1 image channel)
  • depth: depth rendering (the z-buffer) similar to range scanners like the Kinect (1 image channel)
  • RGB: texture rendering (3 image channels)
  • RGB+depth: texture plus depth rendering (3+1=4 image channels)
  • geometry+depth: geometry plus depth rendering (1+1=2 image channels)
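For reference, the channel counts listed above can be summarised as a small Python mapping (purely illustrative; in practice the choice is made via the image_channels string in the JSON configuration file):

# Number of image channels produced by each rendering type (from the list above).
IMAGE_CHANNELS = {
    "geometry": 1,
    "depth": 1,
    "RGB": 3,
    "RGB+depth": 4,
    "geometry+depth": 2,
}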

Pre-trained networks

The algorithm comes with pre-trained networks for two landmark sets: DTU3D, consisting of 73 landmarks that are described in this paper and here, and the landmark set from BU-3DFE, described further down.

Predict landmarks on a single scan

First, determine which landmark set you want to place: either DTU3D or BU-3DFE. Secondly, choose the rendering type that suits your scan. Here are some recommendations:

  • surface with RGB texture: use RGB+depth or RGB
  • surface with vertex colors: use RGB+depth or RGB
  • surface with no texture: use geometry+depth, geometry or depth

Now you can choose the JSON config file that fits your needs, for example configs/DTU3D-RGB+depth.json. Finally, do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourscan

Predict landmarks on a directory with scans

Select a configuration file following the approach above and do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourdirectory

where yourdirectory is a directory (or directory tree) containing scans. It will process all obj, wrl, vtk, stl and ply files.
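Internally this amounts to collecting every supported scan file below the given directory. A hedged sketch of such a gathering step (not the exact code used in predict.py) could look like this:

from pathlib import Path

# Supported scan formats, as listed above.
EXTENSIONS = {".obj", ".wrl", ".vtk", ".stl", ".ply"}

def gather_scans(directory):
    """Return all supported scan files below directory, searched recursively."""
    return sorted(p for p in Path(directory).rglob("*")
                  if p.suffix.lower() in EXTENSIONS)

scans = gather_scans("yourdirectory")
print(f"Found {len(scans)} scans")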

Predict landmarks on a file with scan names

Select a configuration file following the approach above and do the prediction:

python predict.py --c configs/DTU3D-RGB+depth.json --n yourfile.txt

where yourfile.txt is a text file containing names of scans to be processed.
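Such a file is simply a plain list of scan paths, one per line. Reusing the gather_scans sketch from the previous section, it could be generated like this (again only a sketch):

# Write one scan path per line; predict.py can then be pointed at this file.
with open("yourfile.txt", "w") as f:
    for scan in gather_scans("yourdirectory"):
        f.write(str(scan) + "\n")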

Specifying a pre-transformation

The algorithm expects the face to have a certain placement and orientation: the scan should be centered around the origin, the nose should point in the z-direction, and the up direction of the head should be aligned with the y-axis, as seen here:

coord-system

In order to re-align a scan to this system, a section of the JSON configuration file can be modified:

"pre-align": {
	"align_center_of_mass" : true,
	"rot_x": 0,
	"rot_y": 0,
	"rot_z": 180,
	"scale": 1,
	"write_pre_aligned": true
}

Here the scan is first aligned so that its center of mass coincides with the origin. Secondly, it is rotated 180 degrees around the z-axis. The rotation order is z-x-y (a geometric sketch of this transform is given below). This will align this scan:

mri-coord-system

so it is aligned for processing and the result is:

mri-results3

This configuration file can be found as configs/DTU3D-depth-MRI.json.
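For readers who want to see what the pre-alignment does geometrically, here is a hedged NumPy sketch of the transform described above: translate the center of mass to the origin, then rotate in z-x-y order and scale. This is only an illustration of the idea; the exact rotation conventions of the VTK-based implementation may differ.

import numpy as np

def pre_align(points, rot_x=0.0, rot_y=0.0, rot_z=180.0, scale=1.0):
    """Illustrative pre-alignment. points is an (N, 3) array of vertex coordinates."""
    # 1) Move the center of mass to the origin.
    p = points - points.mean(axis=0)
    # 2) Build rotation matrices (angles in degrees).
    ax, ay, az = np.radians([rot_x, rot_y, rot_z])
    rx = np.array([[1, 0, 0],
                   [0, np.cos(ax), -np.sin(ax)],
                   [0, np.sin(ax),  np.cos(ax)]])
    ry = np.array([[ np.cos(ay), 0, np.sin(ay)],
                   [0, 1, 0],
                   [-np.sin(ay), 0, np.cos(ay)]])
    rz = np.array([[np.cos(az), -np.sin(az), 0],
                   [np.sin(az),  np.cos(az), 0],
                   [0, 0, 1]])
    # 3) Apply in z-x-y order (z first), then scale.
    return scale * (p @ (ry @ rx @ rz).T)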

How to use the framework in your own code

Detect 3D landmarks in a 3D facial scan

import argparse                      # used by predict.py for command line parsing
from parse_config import ConfigParser
import deepmvlm
from utils3d import Utils3D

# config is a ConfigParser instance created from one of the JSON files in
# configs/ (see predict.py for how it is built from the command line arguments).
dm = deepmvlm.DeepMVLM(config)

# file_name is the path to the scan, e.g. "assets/testmeshA.obj";
# name_lm_vtk and name_lm_txt are the output file names for the landmarks.
landmarks = dm.predict_one_file(file_name)
dm.write_landmarks_as_vtk_points(landmarks, name_lm_vtk)
dm.write_landmarks_as_text(landmarks, name_lm_txt)
dm.visualise_mesh_and_landmarks(file_name, landmarks)

The full source (including how the JSON config files are read) can be found in predict.py.
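The same API can also be used in a loop to process several scans with a single loaded model. A short sketch using only the methods shown above (the file names are placeholders):

# Process several scans with one DeepMVLM instance (placeholder file names).
dm = deepmvlm.DeepMVLM(config)
for file_name in ["scan1.obj", "scan2.obj"]:
    landmarks = dm.predict_one_file(file_name)
    dm.write_landmarks_as_text(landmarks, file_name + "_landmarks.txt")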

Examples

The following examples use data from external sources.

Artec3D Eva example

Placing landmarks on a scan produced with an Artec3D Eva 3D scanner can be done like this:

  • download the example head scan in obj format
  • then:
python predict.py --c configs\DTU3D-RGB_Artec3D.json --n Pasha_guard_head.obj

Artec3D

  • download the example man bust in obj format
  • then:
python predict.py --c configs\DTU3D-depth.json --n man_bust.obj

Artec3D

Using Multiple GPUs

Multi-GPU training and evaluation can be enabled by setting the n_gpu argument of the config file to a number greater than one. If a smaller number of GPUs than available is configured, the first n devices will be used by default. To use a specific set of GPUs, the command line option --device can be used:

python train.py --device 2,3 --c config.json

The program checks whether a GPU is present and whether it has the required CUDA capability (3.5 and up). If not, the CPU is used; this is slow but still works.
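If you want to verify beforehand whether your GPU will be used, a quick standalone check with standard PyTorch calls (independent of the check Deep-MVLM performs itself) is:

import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    ok = (major, minor) >= (3, 5)
    print(f"GPU found, CUDA capability {major}.{minor}"
          f" ({'OK' if ok else 'too old, the CPU will be used'})")
else:
    print("No GPU found, prediction will run on the CPU")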

How to train and use Deep-MVLM with the BU-3DFE dataset

The Binghamton University 3D Facial Expression Database (BU-3DFE) is a standard database for testing the performance of 3D facial analysis software tools. This section describes how the database can be used to train and evaluate the performance of Deep-MVLM. The approach can be adapted to your own dataset.

Start by requesting and downloading the database from the official BU-3DFE site.

Secondly, download the 3D landmarks for the raw data from Rasmus R. Paulsen's homepage. The landmarks from the original BU-3DFE distribution are fitted to the cropped face data; unfortunately, the raw and cropped face data are not in alignment. The data from Rasmus' site has been aligned to the raw data, thus making it possible to train and evaluate on the raw face data. There are 84 landmarks in this set and they are defined here.

A set of example JSON configuration files is provided. Use, for example, configs/BU_3DFE-RGB_train_test.json and modify it to your needs. Change raw_data_dir, processed_data_dir, and data_dir (which should be equal to processed_data_dir) to match your setup.

Preparing the BU-3DFE data

In order to train the network, the data must be prepared. This means that we pre-render a set of views for each input model; on-the-fly rendering during training is too slow due to the loading of the 3D models. Preparing the data is done by issuing the command:

python preparedata.py --c configs/BU_3DFE-RGB_train_test.json

This will pre-render the image channels rgb, geometry, and depth. If processed_data_dir is set to, for example, D:\data\BU-3DFE_processed\, the rendered images will be placed in the folder D:\data\BU-3DFE_processed\images\ and the corresponding 2D landmarks in the folder D:\data\BU-3DFE_processed\2D LM\. The renderings should look like this:

RGB rendering, geometry rendering, depth rendering

The corresponding landmark file is a standard text file with landmark positions corresponding to their placement in the rendered images. This means that this dataset can now be used to train a standard 2D face landmark detector.

The dataset will also be split into a training and a test set. The ids of the scans used for training can be found in the dataset_train.txt file and the test set in the dataset_test.txt file. Both files are found in the processed_data_dir.
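To sanity-check the pre-rendered data, a rendered image can be loaded together with its 2D landmark file. A hedged sketch, assuming the landmark text file contains one whitespace-separated 2D position per line (the file names are placeholders; verify both against your own processed_data_dir):

import numpy as np
import imageio
import matplotlib.pyplot as plt

# Placeholder paths - adjust to your processed_data_dir and the actual file names.
img = imageio.imread(r"D:\data\BU-3DFE_processed\images\example.png")
lm2d = np.loadtxt(r"D:\data\BU-3DFE_processed\2D LM\example.txt")

plt.imshow(img)
plt.scatter(lm2d[:, 0], lm2d[:, 1], s=5, c="r")
plt.show()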

Training on the BU-3DFE pre-rendered data

To do the training on the pre-rendered images and landmarks the command

python train.py --c configs/BU_3DFE-RGB_train_test.json

is used. The result of the training (the model) will be placed in a folder saved\models\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\, where the saved folder can be specified in the JSON configuration file and DDMMYY_HHMMSS is the current date and time. A simple training log can be found in saved\log\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\. After training, it is recommended to rename and copy the best trained model best-model.pth to a suitable location, for example saved\trained\.

Tensorboard visualisation

Tensorboard visualisation of the training and validation losses can be enabled in the JSON configuration file. The tensorboard data will be placed in the saved\log\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\ directory.
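Once enabled, the logs can typically be viewed with the standard TensorBoard command, pointing --logdir at the log directory mentioned above:

tensorboard --logdir saved\log\MVLMModel_BU_3DFE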

Resuming training

If training is stopped for some reason, it is possible to resume training by using

python train.py --c configs/BU_3DFE-RGB_train_test.json --r path-to-model\best-model.pth

where path-to-model is the path to the current best model (for example saved\models\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\).

Evaluating the model trained on the BU-3DFE data

In order to evaluate the performance of the trained model, the following command is used:

python test.py --c configs/BU_3DFE-RGB_train_test.json --r path-and-file-name-of-model.pth

where path-and-file-name-of-model.pth is the path and filename of the model that should be tested. It should match the configuration in the supplied JSON file. Test results will be placed in a folder named saved\temp\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\. Most interesting is results.csv, which lists the distance error for each landmark for each test mesh.
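As a starting point for analysing the results, the CSV can be loaded with pandas. Since the exact column layout of results.csv is not documented here, the sketch below only prints summary statistics; inspect the file and adapt accordingly:

import pandas as pd

# DDMMYY_HHMMSS is the placeholder used in the folder naming above.
# Add header=None to read_csv if the file turns out to have no header row.
df = pd.read_csv(r"saved\temp\MVLMModel_BU_3DFE\DDMMYY_HHMMSS\results.csv")
print(df.describe())  # overview of the distance errors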

Team

Rasmus R. Paulsen and Kristine Aavild Juhl

License

Deep-MVLM is released under the MIT license. See the LICENSE file for more details.

Credits

This project is based on the PyTorch template pytorch-template by Victor Huang.

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research.


deep-mvlm's Issues

The landmark placing on the MRI head model

Hi everybody,

I have two questions:

  1. Can I place the facial landmarks on the following MRI head model (STL, PLY, or OBJ) with Deep-MVLM? It has 500,000 vertices and 800,000 faces. However, there is no information other than the STL file.

image

  2. By navigating around a head prototype with a depth camera, I obtained 10 RGB-D images. Can I create the head model (3D head.stl) using this code?

Thank you for sharing this.

Is it possible to support the xyz file format?

First of all, thanks for sharing your great 3D face landmark project.
I am trying to recognize a face from a .xyz point cloud data file, but you support OBJ/WRL/VTK/PLY only. I am using a ToF sensor and can only capture xyz data.
So could you please make it possible to support the xyz format in your project?

Thanks & Best Regards
Sui

How to get Deep-MVLM style shading on mesh?

image

On the left is my mesh put through the Deep-MVLM algorithm and visualised with its built-in functions. On the right is the same mesh opened in MeshLab. How do I get lighting/shading similar to yours? It brings out the contours much better, and it would help to know exactly how you did it.

Thanks.

Implementation of 3D landmark annotation

Hi,
Can you share the implementation of the 3D landmark annotation of DTU3D?
The original annotation of the BU-3DFE dataset seems unsatisfactory and does not provide the annotation of the nose tip that your annotation does.

Any help would be appreciated!

AttributeError: 'NoneType' object has no attribute 'GetMapper'

Hi, thank you for your wonderful work, I have learned a lot from your code. But could you help me with this question?
I want to prepare rendered data using my own raw data (.wrl, .bmp, and landmark .txt files). But when I run the preparedata.py file, I get the following error, and unfortunately I cannot solve it. Do you know anything about it? I guess there is some format difference between your data and mine. If possible, could you share a set of example data for training? Maybe data from the DTU dataset, because the BU-3DFE dataset is not freely available for download.

Preparing BU-3DFE data
Read 10 file ids
Processing 7 file ids for training
Processing F0001\F0001_AN02WH
F0001\F0001_AN02WH is locked - skipping
Processing F0001\F0001_AN03WH
F0001\F0001_AN03WH is locked - skipping
Processing F0001\F0001_AN04WH
F0001\F0001_AN04WH is locked - skipping
Processing F0001\F0001_DI01WH
Rendering F0001\F0001_DI01WH
Error near line 2: parse error
Traceback (most recent call last):
File "D:/07_AutomaticLandmarking/Paulsen_DeepMVLM/preparedata.py", line 349, in
main(cfg_global)
File "D:/07_AutomaticLandmarking/Paulsen_DeepMVLM/preparedata.py", line 334, in main
prepare_bu_3dfe_data(config)
File "D:/07_AutomaticLandmarking/Paulsen_DeepMVLM/preparedata.py", line 327, in prepare_bu_3dfe_data
process_file_bu_3dfe(config, base_name, output_dir)
File "D:/07_AutomaticLandmarking/Paulsen_DeepMVLM/preparedata.py", line 97, in process_file_bu_3dfe
pd = vrmlin.GetRenderer().GetActors().GetLastActor().GetMapper().GetInput()
AttributeError: 'NoneType' object has no attribute 'GetMapper'

How to pre-render images for my own dataset?

I have a dataset of 3D meshes of people (not BU-3DFE) and a set of points for each mesh that differs from the pre-trained points. I would like to train Deep-MVLM on my own data. I understand it is better to pre-render the different views before training to save time. What command can I run to create these pre-renderings for my meshes?

Getting wrong landmarks with each config file

Hi!
Thank you for sharing your great work with the community.
I would like to ask which config file I should use for the model below; I tried geometry and depth, but without success. Is there something wrong with my mesh? Thank you.

image

DTU3D-depth on the man_bust.obj example

Hello,
I have an issue when running the command from the documentation: python predict.py --c configs\DTU3D-depth.json --n man_bust.obj
image

The landmarks are not calculated correctly and do not appear in the render window.
image

I did not change any part of the code.
What can I do?

Thanks

DTU-3D dataset

Hi, Prof. Rasmus R. Paulsen,
Is the DTU-3D dataset publicly available?
I would like to train on the dataset with a SOTA network to improve the precision.

Regarding custom dataset

Hi. I have a query regarding the implementation using a custom dataset.
I have an .obj mesh file (example screenshot is attached).
example

I wanted to test the model on it but I get the message "Not enough valid view lines for landmark" for all the landmarks.

I wanted to ask if you can guide me on how to train the model on a custom dataset and how to annotate it.

White Image View

Hello, may I ask: when testing my own data, I find that my .obj file contains vertex normals (vn) while the test data does not. Will this affect the result? I also find that when I import the data into the model, the generated views are all white images. What could be the reason? Thank you very much for your answer.

NotImplementedError: cannot instantiate 'WindowsPath' on your system

When I try to run this code on an AWS SageMaker notebook instance:
python predict.py --c configs/DTU3D-RGB.json --n assets/testmeshA.obj

I run into the following error:
Processing assets/testmeshA.obj
Initialising model
Loading checkpoint
Getting device
Warning: There's no GPU available on this machine,prediction will be performed on CPU.
Loading checkpoint: https://shapeml.compute.dtu.dk/Deep-MVLM/models/MVLMModel_DTU3D_RGB_07092019-c1cc3d59.pth
Traceback (most recent call last):
File "predict.py", line 70, in
main(global_config)
File "predict.py", line 51, in main
process_one_file(config, name)
File "predict.py", line 12, in process_one_file
dm = deepmvlm.DeepMVLM(config)
File "/home/ec2-user/SageMaker/Deep-MVLM/deepmvlm/api.py", line 37, in init
self.device, self.model = self._get_device_and_load_model_from_url()
File "/home/ec2-user/SageMaker/Deep-MVLM/deepmvlm/api.py", line 75, in _get_device_and_load_model_from_url
checkpoint = load_url(check_point_name, model_dir, map_location=device)
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/hub.py", line 506, in load_state_dict_from_url
return torch.load(cached_file, map_location=map_location)
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/serialization.py", line 529, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/serialization.py", line 702, in _legacy_load
result = unpickler.load()
File "/home/ec2-user/anaconda3/envs/pytorch_p36/lib/python3.6/pathlib.py", line 1002, in new
% (cls.name,))
NotImplementedError: cannot instantiate 'WindowsPath' on your system
