nerfblendshape-code's People

Contributors

ustc3dv, xuanghahahaha

nerfblendshape-code's Issues

Doubt regarding Inference

Hey,
Thanks for releasing the code for this amazing paper. The inference script mentions that the last 500 frames are used as the test set, which belongs to the same speaker. Does this code support motion/expression transfer from another driving speaker, and if so, how can I go forward with it?
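
For reference, here is how I imagine such a transfer could look; this is only my assumption with placeholder tensors, not this repository's actual API:

import torch

exp_dim = 46                                   # assumed coefficient dimension
num_test = 500                                 # last 500 frames form the test split
subject_exps = torch.randn(num_test, exp_dim)  # placeholder: subject's tracked coefficients
driving_exps = torch.randn(num_test, exp_dim)  # placeholder: driving speaker's coefficients

# Reenactment idea: condition the trained model on the driving coefficients instead of
# the subject's own test coefficients, while keeping the subject's camera poses.
transferred_exps = driving_exps.clone()

# Optionally clamp to the subject's observed range (cf. min_46.txt / max_46.txt) so the
# model is not driven outside the expressions it saw during training.
exp_min = subject_exps.min(dim=0).values
exp_max = subject_exps.max(dim=0).values
transferred_exps = torch.maximum(torch.minimum(transferred_exps, exp_max), exp_min)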

AttributeError: '_module_functionBackward' object has no attribute 'set_materialize_grads'

Thank you for your wonderful work!!!

When I ran the inference code, I encountered the following problem. Could you give me some suggestions? Thank you so much!
(Environment: Ubuntu 18.04, 2080Ti, cuda 10.2, pytorch 1.6.0, python3.8, gcc 8.1.0, cmake 3.21)

  File "....../NeRFBlendShape/nerf/network_tcnn.py", line 127, in forward
    h = self.sigma_net(x)
  File "....../lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "....../lib/python3.8/site-packages/tinycudann-1.7-py3.8-linux-x86_64.egg/tinycudann/modules.py", line 177, in forward
    output = _module_function.apply(
  File "....../lib/python3.8/site-packages/tinycudann-1.7-py3.8-linux-x86_64.egg/tinycudann/modules.py", line 87, in forward
    ctx.set_materialize_grads(False)
AttributeError: '_module_functionBackward' object has no attribute 'set_materialize_grads'
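
One workaround I am considering (an unverified assumption on my side, since ctx.set_materialize_grads only exists in newer PyTorch releases) is to guard the call at the line shown in the traceback inside tinycudann/modules.py; upgrading PyTorch is probably the cleaner fix:

# skip set_materialize_grads on PyTorch versions that do not provide it
if hasattr(ctx, "set_materialize_grads"):
    ctx.set_materialize_grads(False)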

question about dataset

I'd like to express my gratitude for your valuable code contributions. Your work has been incredibly helpful.
I'm interested in creating my own dataset and would appreciate it if you could provide guidance on the specific steps involved. I'm very thankful for your assistance.

about the computational efficiency of updating the density grid

Hello,

I have a question about the computational efficiency of the density grid:
Based on my current understanding of the Expression-Aware Density Grid Update, I wrote the procedure as the pseudocode below. The logic is to iterate over the dimensions of the expression coefficients one at a time, setting each dimension to its maximum value and the mean-face coefficient to 1; calling the NeRF's query_density then yields the density corresponding to the hash grid h^i in Eq. (9) of the paper. After iterating over all expression dimensions, taking the element-wise maximum over all density grids gives the final density grid.
I would like to ask whether this logic for updating the density grid is correct. Also, doing it this way requires calling the NeRF's get_density function once per expression coefficient dimension for every density grid update; wouldn't the computational efficiency be very low?

exp_dim = 46  # number of 3DMM expression coefficient dimensions
tmp_grid_max_expr = torch.zeros_like(self.density_grid)
for i in range(1, exp_dim + 1):  # start from 1 because index 0 denotes the mean face
    rays_expr = torch.zeros(1, exp_dim + 1)  # mean-face coefficient plus exp_dim expression coefficients
    # dim 0 of expr_max denotes the mean face, and its value is always 1
    rays_expr[:, 0] = 1
    rays_expr[:, i] = expr_max[:, i]
    tmp_grid = torch.zeros_like(self.density_grid)

    for xyzs in all_points_from_density_grid:
        # query density conditioned on the expression coefficients
        sigmas = query_density(xyzs, rays_expr)
        # update the current density grid
        tmp_grid[xyzs] = sigmas
    # take the element-wise maximum of all density grids to get the final density grid
    tmp_grid_max_expr = torch.maximum(tmp_grid_max_expr, tmp_grid)
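
For reference, a vectorized variant of the same logic that at least removes the inner Python loop over grid points (still only my assumption; query_density and expr_max are the same placeholders as above, and the grid resolution is made up):

import torch

exp_dim = 46      # assumed number of 3DMM expression coefficient dimensions
grid_res = 128    # assumed density grid resolution
axes = [torch.linspace(-1.0, 1.0, grid_res)] * 3
xyzs = torch.stack(torch.meshgrid(*axes, indexing="ij"), dim=-1).reshape(-1, 3)

density_grid = torch.zeros(xyzs.shape[0])
for i in range(1, exp_dim + 1):
    rays_expr = torch.zeros(1, exp_dim + 1)
    rays_expr[:, 0] = 1.0                    # mean-face coefficient is always 1
    rays_expr[:, i] = expr_max[:, i]         # maximum observed value of dimension i
    sigmas = query_density(xyzs, rays_expr)  # one batched query per coefficient dimension
    density_grid = torch.maximum(density_grid, sigmas)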

Confusion about the background of the picture

This is really a great job.

I have some small points of confusion, but I cannot verify them myself due to limitations of data and equipment. I hope you can help me resolve them.

The backgrounds of the figures in the dataset are all solid colors. Is this to reduce the computational cost of reconstruction, or because manipulating the head would affect the background region, e.g., through distortion?

Comparisons to NeRFace

Hi,

Congrats on the paper!

Watching the videos, I'm curious about the flickering in NeRFace; could it be that you don't freeze the latent codes during inference?

Can you share the face mask for the dataset

Hi, thanks for your excellent work!
I tried to obtain the face mask for each frame using the face-parsing.PyTorch repo as suggested in the paper, but the results are not as good as Fig. 6; see the Obama example predicted directly from face-parsing.PyTorch:
[image: parsing result for frame 00020]

Can you release the face mask or provide some insights on obtaining more accurate masks?
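
For context, this is roughly how I convert the parsing output into a head mask (the label IDs and the morphological cleanup are my own assumptions, not the paper's pipeline):

import cv2
import numpy as np

def head_mask_from_parsing(parsing: np.ndarray) -> np.ndarray:
    # keep skin, brows, eyes, ears, nose, mouth, lips and hair; drop background/neck/cloth
    head_labels = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 17, 18]
    mask = np.isin(parsing, head_labels).astype(np.uint8) * 255
    # close small holes and remove speckles that cause ragged mask boundaries
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return mask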

Thanks.

The dataset has no audio

Hello author, thank you for your excellent work, which will soon be open-sourced!
The video dataset you placed on Google Drive has no sound; will audio be provided?

About using a new video as dataset

Hi!
Thanks for your great work!

I've finished testing the network with the provided training dataset, and the performance is amazing.
But I wonder: if I want to use a new video, for example a video of myself, how can I get the expression coefficients, like those in max_46.txt and min_46.txt, as well as the transforms.json?
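
For example, if I track my own video with a 3DMM and get per-frame expression coefficients, would the per-dimension ranges simply be computed like this (my assumption about the file contents; the tracker output file is a placeholder)? And does transforms.json then just hold the per-frame camera poses from the tracker?

import numpy as np

exps = np.load("tracked_exps.npy")          # [num_frames, 46] coefficients from my own tracker
np.savetxt("max_46.txt", exps.max(axis=0))  # per-dimension maximum over the video
np.savetxt("min_46.txt", exps.min(axis=0))  # per-dimension minimum over the video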

about the param "xyzstoframes"

I am curious about the parameter "xyzstoframes" used in the code, which seems unused during inference. What is this parameter for? Thank you!

Will the model training code be released?

Dear author,

This is really amazing work! We want to run some benchmarks against your method, which requires the training code; will you release this part?

Regards

What is exp_ori in the given dataset?

Hi,

Thank you so much for the excellent code and for at least making the inference code available!

May I ask a quick question: in the released dataset there seems to be an exp_ori, which is passed all the way to forward() but is actually not used (only exp is used). May I ask what exactly exp_ori is?

Many thanks!

Discussion of camera error

You mention in the limitations of your paper that "The camera parameters and input conditions are important for NeRF based techniques. Large errors in tracking may cause losing details in our constructed model."

So why not add a learnable code, as in NeRFace, to the MLP input along with the positional encoding and expression coefficients, to compensate for errors in facial expression and pose estimation?
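
Something like the sketch below is what I have in mind (only an illustration with placeholder dimensions, not your actual network):

import torch
import torch.nn as nn

class LatentConditionedField(nn.Module):
    def __init__(self, num_frames, exp_dim=46, latent_dim=32, pos_dim=63):
        super().__init__()
        # one learnable code per training frame, optimized jointly with the radiance field
        self.frame_codes = nn.Embedding(num_frames, latent_dim)
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + exp_dim + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 4),  # density + rgb
        )

    def forward(self, x_enc, exp, frame_idx):
        code = self.frame_codes(frame_idx)  # [B, latent_dim]
        return self.mlp(torch.cat([x_enc, exp, code], dim=-1))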

Question about density grid

Can you explain more about how the density grid is used during training? I get bad results when I reproduce the paper (I use the density grid code from torch-ngp).

Artifacts around hair in the reproduced results

Hi, thanks for your work! I tried to reproduce the results and reimplement the training code of NeRFBlendshape. Currently, the rendering quality on the validation set is overall similar to the released model:

[image: validation-set rendering]

But there are some artifacts around the hair, especially on the side of the head around the ear, as shown in the figure below:

[image: artifacts around the ear]

I think these results are somewhat unexpected, since the rendering quality is good in other regions, including facial hair such as the mustache and eyebrows. I wonder whether you met this problem during training. Can you provide some suggestions to avoid this artifact? Thanks a lot!

Here is my training schedule:

  1. Epochs 0~7: the L1 color loss is applied only to randomly sampled pixels, with weight 1;
  2. Epochs 7~15: 0.5 probability to sample a 32x32 patch and 0.5 probability to sample random pixels. When sampling random pixels, only the L1 color loss is applied, with weight 1; when sampling a patch, the L1 color loss and LPIPS (VGG backbone) are applied, both with weight 0.1. When sampling a patch, there is a 0.5 probability to sample around the mouth and a 0.5 probability to sample uniformly over the image (see the sketch after this list);
  3. The batch size is set to 1, i.e., only patches or random pixels from a single image are sampled per step.
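
For reference, this is roughly how I combine the losses in step 2 (my own simplified reproduction code; the lpips package and the weights shown are my choices, not the authors'):

import random
import torch
import lpips

lpips_vgg = lpips.LPIPS(net="vgg")       # expects [N, 3, H, W] images in [-1, 1]

def color_loss(pred, gt, is_patch):
    l1 = (pred - gt).abs().mean()
    if not is_patch:
        return l1                        # random pixels: L1 only, weight 1
    perceptual = lpips_vgg(pred * 2 - 1, gt * 2 - 1).mean()
    return 0.1 * l1 + 0.1 * perceptual   # 32x32 patches: L1 + LPIPS, both with weight 0.1

is_patch = random.random() < 0.5         # epochs 7~15: half patches, half random pixels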

Another question is about the dataset. I find that there are N+1 frames in the provided mp4, but only N annotations in the json files. Does the 0th annotation correspond to the 0th frame or the 1st frame?

question about the ExpHashEncoder implementation

Hi, thanks for your solid work and code!

I wonder why the implementation does not directly use the hash encoder provided by the tinycudann library. Are there some special concerns? Is tinycudann's hash encoder not compatible with the CUDA-accelerated ray marching?

Thanks!

question about your evaluation

Hello, thanks for your great work and for making code available.

I have a question about the results of Table 2 in your paper. Are those metrics computed on the 8 identities that you make publicly available here?

Also, do you compute the metrics with a white background or with the static background extracted from the images?

TypeError: _composite_raysBackward.forward: expected Tensor or tuple of Tensor (got NoneType) for return value 0

Thank you for your wonderful work!
I ran the inference code and got this error.

depth, image,exps_code = _run(rays_o, rays_d,exps,exp_ori, bound, num_steps, bg_color) 
File "....../NeRFBlendShape/nerf/renderer.py", line 130, in run_cuda
raymarching.composite_rays(n_alive, n_step, rays_alive[i % 2], rays_t[i % 2], sigmas, rgbs, deltas, weights_sum, depth, image)
TypeError: _composite_raysBackward.forward: expected Tensor or tuple of Tensor (got NoneType) for return value 0

def forward(ctx, n_alive, n_step, rays_alive, rays_t, sigmas, rgbs, deltas, weights, depth, image):

It seems that the "forward" function lacks "return XXX". Could you give me some suggestions?Thank you so much!!!

about code releasing

Great work! I wonder when you will release the code; I think it could inspire more future work!

meaning of min_46.txt and max_46.txt

Hi
thanks for your great work!
Is it the range of the expression coefficients, and how do you get it?
Is it necessary for training or inference?

Question about Eq. 9 of the paper

Hi! I am confused about Eq. 9 of the paper, which describes how to update the density grid.
According to the equation, for each expression basis we compute a new multi-resolution hashtable h^i. My question is: what is the relationship between the newly computed hashtable h^i and the density grid? Are they equivalent? If not, how is the density grid updated according to h^i?
Looking forward to your reply!

Advice on generating own tracking data

Hi,

First of all, amazing work!

I had questions about what max_per and min_per are doing in the code and how we can calculate them. Are the expression coefficients the same as the expression parameters obtained from a 3DMM model? I basically want to know how this combination takes place and how I can generate my own data for it using a 3DMM tracker.

Thanks

TypeError: 'module' object is not callable

Thanks for your great work!
I ran the inference code and got the following error:
[screenshot of the traceback]

It seems the error comes from the function compact_rays.
Could you give me some suggestions? Thank you so much!

New expression

When you input a new expression, does the model have to be re-trained every time? I see that you concatenate the queried feature with the expression code, so it seems like every time there's a new expression code we would have to train the model again. How is inference handled in this model? Very nice paper, by the way, with what looks like great results!

about basis num

Hello, thank you for releasing the code. I have a question about the number of bases.
The dimension of the expression coefficients is set to 46 according to the paper, but in the code it is set to 17. I wonder why this dimension is smaller than the paper describes. Hoping for your answer!
