ustc3dv / nerfblendshape-code Goto Github PK

View Code? Open in Web Editor NEW

209.0 209.0 18.0 676 KB

Python 57.84% C++ 0.68% Cuda 33.17% C 7.38% Shell 0.93%

nerfblendshape-code's People

Contributors

Stargazers

Watchers

Forkers

zivzone peterzs dfqytcom bruinxiong kirstihly yufan1012 phoenixdigitalfx peibinchen 0iui0 match08 xuleran zardyuan wasahaiah jackzhousz kuoenterprises nlml

nerfblendshape-code's Issues

Hey,
Thanks for releasing the code for this amazing paper. The inference script mentions that the last 500 frames would be used as a test set which would belong to the same speaker. Does this code work for motion/expression transfer from another driving speaker and if yes, how can I go forward with it?

The dataset you provided are corrupted. Some data are missing.

AttributeError: '_module_functionBackward' object has no attribute 'set_materialize_grads'

Thank you for your wonderful work!!!

When I run the inference code, I encountered the following problems. Could you give me some suggestions? Thank you so much！
（Environment: Ubuntu 18.04, 2080Ti, cuda 10.2, pytorch 1.6.0, python3.8, gcc 8.1.0, cmake 3.21）

  File "....../NeRFBlendShape/nerf/network_tcnn.py", line 127, in forward
    h = self.sigma_net(x)
  File "....../lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "....../lib/python3.8/site-packages/tinycudann-1.7-py3.8-linux-x86_64.egg/tinycudann/modules.py", line 177, in forward
    output = _module_function.apply(
  File "....../lib/python3.8/site-packages/tinycudann-1.7-py3.8-linux-x86_64.egg/tinycudann/modules.py", line 87, in forward
    ctx.set_materialize_grads(False)
AttributeError: '_module_functionBackward' object has no attribute 'set_materialize_grads'

question about dataset

I'd like to express my gratitude for your valuable code contributions. Your work has been incredibly helpful.
I'm interested in creating my own dataset and would appreciate if you could provide guidance on the specific steps involved. I'm very thankful for your assistance.

about the computational efficiency of updating the density grid

您好,

关于density grid的计算效率，有个问题想要请教一下：
以我目前对Expression-Aware Density Grid Update的理解, 我把这个过程写成如下的伪代码，逻辑是顺序地对于表情系数的每一维取最大值，以及平均脸系数设为1，这样调用nerf的query_density就能得到论文中式（9），h^i这个hashgrid对应的density，遍历表情系数所有维度后，对所有的density grid取element-wise maximum就能得到最终density grid。
我想问一下这样更新density grid的逻辑是否正确？以及这样做的话会在每次更新density grid的时候需要请求3DMM_expression_coef_dim次nerf的get_density函数，这样计算效率会不会很低呢？

tmp_grid_max_expr = torch.zeros_like(self.density_grid)
for i in range(1, 3DMM_expression_coef_dim + 1): # start from 1 because index 0 denotes the mean face
    rays_expr = (tensor, whose shape is [1, 3DMM_expression_coef_dim])
    # assume dim 0 of the expr_max denotes the mean face, and its value is always 1.
    rays_expr[:, 0] = 1
    rays_expr[:, i] = expr_max[:, i]
    tmp_grid = torch.zeros_like(densiy_grid)

    for xysz in all_points_from_density_grid:
        # query density conditioning on the expression coefficients
        sigmas = query_density(xysz, rays_expr)
        # update current density grid
        tmp_grid[xyzs] = sigmas
    # take the element-wise maximum of all density grids to get the final density grid
    tmp_grid_max_expr = torch.maximum(tmp_grid_max_expr, tmp_grid)

Confusion about the background of the picture

This is really a great job.

I have some small confusion, but I cannot verify myself due to the limitation of data set and equipment. I hope you can help me solve the confusion.

The backgrounds of the figures in the data are all solid colors, whether this is to reduce the computational amount of reconstruction, or because the manipulation of the head will affect the background area, such as distortion.

Question about expression-aware density grid update strategy

Hi,Thanks for your amazing work! I am confused with the expression-aware stragy in your paper: Does the 'h' mean the hash table or the density grid? If it is hash table, how to get the density grid using new hash table?

How to extract mesh?

Hi, I want to know if we can get mesh from this kind of method?

Comparisons to NeRFace

Hi,

Congrats on the paper!

Watching the videos I'm curious about the flickering in nerface, could it be that you don't freeze the latent codes during inference?

Can you share the face mask for the dataset

Hi, thanks for your excellent work!
I tried to obtain the face mask for each frame using the face-parsing.PyTorch repo as suggested in the paper, but the results are not as good as Fig 6, see the Obama example predicted directly from face-parsing.PyTorch:

Can you release the face mask or provide some insights on obtaining more accurate masks?

Thanks.

when will you release the source code?

数据集没有声音

作者您好，感谢您即将开源的出色工作！
您存放在谷歌网盘上的视频数据集没有声音，请问是否提供音频？

About using a new video as dataset

Hi!
Thanks for your great work!

I've done testing the network with the provided training dataset and I have received amazing performance.
But I wonder, if I want to use a new video, for example, a video of myself, how can I get those expression coefficients, like in the max_46.txt, min_46.txt, as well as the transforms.json?

about the param "xyzstoframes"

I am curious about the parameter "xyzstoframes" used in the code, where it seems not used in inference. I wonder what is the param for. Thank you!

What is the definition of the 47 blendshapes?

Hi,
I wonder the definition of each dimension in your blendshape. Where can I find that?
Thanks.

Will model training code release?

Dear author,

That's really an amazing work! We want to make some benchmark on your work which need training code, will you release this part?

Regards

What is exp_ori in the given dataset?

Hi,

Thank you so much for the excellent code and for at least making the inference code avaliable!

May I ask a quick question -- in the released dataset there seems to be a exp_ori, which is passed all the way to the forward(), but is actually not used (only exp is used). May I ask what exactly is exp_ori?

Many thanks!

What is the meaning of the parameter 'index'?

Hi,
Thanks for your opensource code. I'm curious about the 'index' parameter at https://github.com/USTC3DV/NeRFBlendShape-code/blob/main/nerf/network_tcnn.py#L106 and https://github.com/USTC3DV/NeRFBlendShape-code/blob/main/nerf/renderer.py#L147. How is it used when training?
Thanks.

Discussion of camera error

You mentioned in the limitation of your paper that The camera parameters and input conditions are important for NeRF based techniques. Large errors in tracking may cause losing details in our constructed model.

So why not add a learnable code like Nerface to the MLP along with the position code and expression coefficient to compensate for errors in facial expression and posture estimation

Question about density grid

Can you explain more about how to use the density grid when training. I get bad result when I reproduce the paper (I use density grid code in torch-ngp).

Artifacts around hair in the reproduced results

Hi, thanks for your work! I tried to reproduce the results and reimplement the training code of NeRFBlendshape. Currently, the rendering quality on the validation set is overall similar to the released model:

But there are some artifacts around the hair, especially on the side head around the ear, shown in the figure below:

I think the results are somewhat unexpected since the rendering quality is good in other regions, including some facial hair like mustache and eyebrow. I wonder whether you have met this problem during training. Can you provide some suggestions to avoid this artifact? Thanks a lot!

Here are my training schedules:

0~7 epoch: L1 color loss is only applied to randomly sampled pixels, with weight 1;
7~15 epoch: 0.5 prob to sample 32x32 patch and 0.5 prob to sample random pixels; when sample random pixels, L1 color loss is applied, with weight 1; when sample random patch, L1 color loss and LPIPS (vgg backbone) are applied, with both weight 0.1. When sample patch, 0.5 prob to sample around the mouth, and 0.5 prob for uniform sample in the image;
the batch size is set to 1, i.e., only sample patch or random pixels in one image.

Another question is about the dataset. I find that there are N+1 frames in the provided mp4, but there are only N annotations in the json files. Is the 0th annotation corresponds to the 0th frame or the 1st frame?

before use it , should i need to build the cuda extension?

pip install .

should I run it?

It seems I have to run it to use cuda&cpp extension

What's the method used to segment the front face?

The segmented face in your video is smooth and accurate, however when I segment the face using face-parsing.PyTorch, the result is flicking.

question about the ExpHashEncoder implementation

Hi, thanks for solid your work and code!

I wonder why not directly use the hash encoder provided in the tinycudann lib in the implementation? Are there some special concerns? Is the hash encoder provided by the tinycudann lib not compatible with the cuda accelerated ray marching?

Thanks!

question about your evaluation

Hello, thanks for your great work and for making code available.

I have a question about the results of table 2 in your paper. Are those table metrics computed on the 8 identities that you make publicly available here?

Also, do you compute the metrics with white background or the static background extracted from the images?

how provide new exprssion and pose to reenact face

Where should I modify the code so that the model can follow the emoticons and postures I provide, or control the postures and postures of render out.

code releasing?

TypeError: _composite_raysBackward.forward: expected Tensor or tuple of Tensor (got NoneType) for return value 0

Thank you for your wonderful work！
I ran the inference code and got this error.

depth, image,exps_code = _run(rays_o, rays_d,exps,exp_ori, bound, num_steps, bg_color) 
File "....../NeRFBlendShape/nerf/renderer.py", line 130, in run_cuda
raymarching.composite_rays(n_alive, n_step, rays_alive[i % 2], rays_t[i % 2], sigmas, rgbs, deltas, weights_sum, depth, image)
TypeError: _composite_raysBackward.forward: expected Tensor or tuple of Tensor (got NoneType) for return value 0

NeRFBlendShape-code/raymarching/raymarching.py

Line 66 in d198ac2

    
           def forward(ctx, n_alive, n_step, rays_alive, rays_t, sigmas, rgbs, deltas, weights, depth, image):

It seems that the "forward" function lacks "return XXX". Could you give me some suggestions？Thank you so much！！！

about code releasing

Great work! I wonder when you will release the code, I think it can inspire more future works!

meaning of min_46 max_46.txt

Hi
thanks for your great work!
Is it the range of expression coefficients? and how do you get it ?
Is it necessary for training or inference?

Question about Equ. 9 of the paper

Hi! I am confused of the Equ. 9 of the paper, where the equation is about how to update the density grid.
According to the equation, for each expression basis, we compute a new multi-resolution hashtable h^i. My question is: what is the relationship between the newly computed hashtable h^i and the density grid? Are they equivalent? If not, how to update the density grid according to h^i?
Looking forward to your reply!

Advice on generating own tracking data

Hi,

First of all, amazing work!

I had questions about what are max_per and min_per doing in the code and how can we calulate them? And are expression coefficients the same as expression parameters obtained from a 3DMM model? I basically want to know how this combination is taking place and how can I generate my own data for it using a 3DMM tracker.

Thanks

TypeError: 'module' object is not callable

Thanks for your great work!
I ran the inference code, and got the issue:

it seems the error from the function: compact_rays
Could you give me some suggestions? Thank you so much！

New expression

When you input a new expression does the model have to be re-trained everytime? I see that you concatenate the queried feature with the expression code so it seems like every time there's a new expression code we have to train the model again. How is inference handled in this model? Very nice paper btw with what looks like great results!

[Render] What is the difference of the codes between NeRFBlendShape-code and ngp?

Hi,
I notice that the render codes of NeRFBlendshape-code is slightly different from that of ngp. Can I directly use the render codes in ngp to train and test the NeRFBlendshape model?

about basis num

Hello, thank you for code releasing. I have a question about the basis number.
The dimension of the expression coefficient is set to 46 according to the paper. But in the code, it is set to 17. I wonder why the dimension is smaller than that of the paper described. Hoping for your answer!