zju3dv / enerf

SIGGRAPH Asia 2022: Code for "Efficient Neural Radiance Fields for Interactive Free-viewpoint Video"

Home Page: https://zju3dv.github.io/enerf

License: Other

Languages: Python 100.00%
Topics: 4d-reconstruction, dynamic-view-synthesis, novel-view-synthesis, siggraph-asia-2022

enerf's Introduction

News

  • 02/12/2023 We release the ENeRF object-compositional representation code, including training and visualization for the ENeRF-Outdoor dataset.
  • 01/10/2023 We release the ENeRF-Outdoor dataset.

ENeRF: Efficient Neural Radiance Fields for Interactive Free-viewpoint Video

Efficient Neural Radiance Fields for Interactive Free-viewpoint Video
Haotong Lin*, Sida Peng*, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao and Xiaowei Zhou
SIGGRAPH Asia 2022 conference track
Project Page

Installation

Set up the python environment

conda create -n enerf python=3.8
conda activate enerf
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html # Important!
pip install -r requirements.txt
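
After installation, a quick sanity check (a minimal sketch, not part of the repository) can confirm that the CUDA 11.1 build of PyTorch was picked up:

# Environment check (illustrative, not from the repository).
import torch
print(torch.__version__)          # expected: 1.9.0+cu111
print(torch.version.cuda)         # expected: 11.1
print(torch.cuda.is_available())  # should print True on a machine with an NVIDIA GPU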

Set up datasets

0. Set up workspace

The workspace is the disk directory that stores datasets, training logs, checkpoints and results. Please ensure it has enough space.

export workspace=$PATH_TO_YOUR_WORKSPACE

1. Pre-trained model

Download the pretrained model from dtu_pretrain (pretrained on the DTU dataset).

Put it into $workspace/trained_model/enerf/dtu_pretrain/latest.pth.
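
To double-check that the checkpoint landed in the right place, a minimal sketch (assuming a standard PyTorch checkpoint; the 'net' key is suggested by the load_network call quoted in the issues below):

# Checkpoint sanity check (illustrative, not from the repository).
import os
import torch
ckpt_path = os.path.join(os.environ['workspace'], 'trained_model/enerf/dtu_pretrain/latest.pth')
ckpt = torch.load(ckpt_path, map_location='cpu')
print(list(ckpt.keys()))  # a 'net' entry holding the model weights is expected here (assumption)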

2. DTU

Download the preprocessed DTU training data and Depth_raw from the original MVSNet repo and unzip them. MVSNeRF provides a DTU example; please follow the example's folder structure.

mv dtu_example.zip $workspace
cd $workspace
unzip dtu_example.zip

These commands only show the example directory structure. You should download all scenes in the DTU dataset and organize the data according to the example directory structure; otherwise you can only evaluate and fine-tune on the example data.

3. NeRF Synthetic and Real Forward-facing

Download the NeRF Synthetic and Real Forward-facing datasets from NeRF and unzip them to $workspace. You should end up with the following directories:

$workspace/nerf_llff_data
$workspace/nerf_synthetic

4. ZJU-MoCap

Download the ZJU-MoCap dataset from NeuralBody. Put it into $workspace/zju_mocap/CoreView_313.

5. ENeRF-Outdoor

Download the ENeRF-Outdoor dataset from this link. Put it into $workspace/enerf_outdoor/actor1.

Training and fine-tuning

Training

Use the following command to train a generalizable model on DTU.

python train_net.py --cfg_file configs/enerf/dtu_pretrain.yaml 

Our code also supports multi-GPU training. The published pretrained model was trained for 138000 iterations with 4 GPUs.

python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/enerf/dtu_pretrain.yaml distributed True gpus 0,1,2,3

Fine-tuning

cd $workspace/trained_model/enerf
mkdir dtu_ft_scan114
cp dtu_pretrain/138.pth dtu_ft_scan114
cd $codespace # codespace is the directory of the ENeRF code
python train_net.py --cfg_file configs/enerf/dtu/scan114.yaml

Fine-tuning for 3000 and 11000 iterations takes about 11 minutes and 40 minutes, respectively, on our test machine (i9-12900K CPU, RTX 3090 GPU).

Fine-tuning on the ZJU-MoCap dataset

python train_net.py --cfg_file configs/enerf/zjumocap/zjumocap_train.yaml

Training on the ENeRF-Outdoor dataset (from scratch)

python train_net.py --cfg_file configs/enerf/enerf_outdoor/actor1.yaml

Evaluation

Evaluate the pretrained model on DTU

Use the following command to evaluate the pretrained model on DTU.

python run.py --type evaluate --cfg_file configs/enerf/dtu_pretrain.yaml enerf.cas_config.render_if False,True enerf.cas_config.volume_planes 48,8 enerf.eval_depth True
{'psnr': 27.60513418439332, 'ssim': 0.9570619, 'lpips': 0.08897018397692591}
{'abs': 4.2624497, 'acc_2': 0.8003020328362158, 'acc_10': 0.9279663826227568}
{'mvs_abs': 4.4139433, 'mvs_acc_2': 0.7711405202036934, 'mvs_acc_10': 0.9262374398033109}
FPS:  21.778975517304048

The 21.8 FPS at 512x640 was measured on a desktop with an Intel i9-12900K CPU and an RTX 3090 GPU. Add "save_result True" at the end of the command to save the rendering results.
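
For reference, the PSNR reported above is the standard peak signal-to-noise ratio over images scaled to [0, 1]; a minimal sketch (not the repository's evaluator):

# PSNR sketch (illustrative); assumes predictions and ground truth are float arrays in [0, 1].
import numpy as np
def psnr(pred, gt):
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return -10.0 * np.log10(mse)  # equals 10 * log10(1 / MSE) for a peak value of 1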

Evaluate the pretrained model on LLFF and NeRF datasets

python run.py --type evaluate --cfg_file configs/enerf/nerf_eval.yaml
python run.py --type evaluate --cfg_file configs/enerf/llff_eval.yaml

Evaluate the pretrained model on the ZJU-MoCap dataset

python run.py --type evaluate --cfg_file configs/enerf/zjumocap_eval.yaml
==============================
CoreView_313_level1 psnr: 31.48 ssim: 0.971 lpips:0.042
{'psnr': 31.477305846323087, 'ssim': 0.9714806, 'lpips': 0.04184799361974001}
==============================
FPS:  49.24468263992353

Visualization for the ENeRF-Outdoor dataset

python run.py --type visualize --cfg_file configs/enerf/enerf_outdoor/actor1_path.yaml

Interactive Rendering

We release an interactive rendering GUI for the ZJU-MoCap dataset.

python gui_human.py --cfg_file configs/enerf/interactive/zjumocap.yaml
Usage:

Mouse wheel:          Zoom in/out
Mouse left button:    Move
Mouse right button:   Rotate
Keyboard a:           Align #  Hold down a and then use the mouse right button to rotate the object for a good rendering trajectory
Keyboard s:           Snap

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{lin2022enerf,
  title={Efficient Neural Radiance Fields for Interactive Free-viewpoint Video},
  author={Lin, Haotong and Peng, Sida and Xu, Zhen and Yan, Yunzhi and Shuai, Qing and Bao, Hujun and Zhou, Xiaowei},
  booktitle={SIGGRAPH Asia Conference Proceedings},
  year={2022}
}

enerf's People

Contributors

haotongl, pengsida


enerf's Issues

Dataset access

Dear authors, hi~
Could I ask for the dataset download agreement form? I found that the link to download the agreement form you provided earlier is no longer valid.

Strange error while fine-tuning zjumocap

I want to fine-tune on zjumocap 313, just as the config does, but I ran into the following issue.
Is it caused by Windows?

By the way, I modified lines like

os.system('mkdir -p {}'.format(model_dir))

to something like os.makedirs(model_dir, exist_ok=True), because on Windows the original command creates a folder literally named "-p" under the workspace.
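
For reference, a portable version of that change (a sketch, not the repository's exact code):

# Cross-platform replacement for os.system('mkdir -p {}'.format(model_dir)):
# os.makedirs creates intermediate directories and avoids a literal "-p" folder on Windows.
import os
model_dir = os.path.join('trained_model', 'enerf', 'zjumocap')  # example path for illustration
os.makedirs(model_dir, exist_ok=True)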

Error logs:

(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> python train_net.py --cfg_file configs/enerf/zjumocap/zjumocap_train.yaml
Workspace:  D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME:  zjumocap
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\lpips\weights\v0.1\vgg.pth
A subdirectory or file D:\workspace4tian\ENeRF\result\enerf\zjumocap\default already exists.
Error occurred while processing: D:\workspace4tian\ENeRF\result\enerf\zjumocap\default.
Load pretrain model: D:\workspace4tian\ENeRF\trained_model\enerf\dtu_pretrain\latest.pth
Traceback (most recent call last):
  File "train_net.py", line 117, in <module>
    main()
    w.start()
  File "C:\Program Files\Python38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\Python38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Program Files\Python38\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'lib.datasets.zjumocap.enerf.Dataset'>: it's not the same object as lib.datasets.zjumocap.enerf.Dataset
Workspace:  D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME:  zjumocap
(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Coordinate transformation problem

I'm not quite clear about the calculation of self.XYZ in enerf_interactivate.py; could you please explain it? I understand that it roughly converts pixel coordinates to camera coordinates, but it's a little confusing why the matrix needs to be inverted and then transposed.
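
For context, the standard pixel-to-camera back-projection is sketched below (general geometry, not the code from enerf_interactivate.py); the transpose usually just switches between column-vector math and the (N, 3) row-vector layout used for point arrays:

# Back-projection sketch (illustrative): pixel grid -> camera-frame rays at unit depth.
import numpy as np
H, W = 4, 5
K = np.array([[500.0, 0.0, W / 2],
              [0.0, 500.0, H / 2],
              [0.0, 0.0, 1.0]])
u, v = np.meshgrid(np.arange(W), np.arange(H))
pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(np.float64)  # (N, 3) rows
# Column-vector form: X_cam = K^{-1} @ [u, v, 1]^T (up to scaling by depth).
# With points stored as rows, (K^{-1} @ pix.T).T == pix @ np.linalg.inv(K).T,
# which is where the "invert, then transpose" pattern comes from.
xyz = pix @ np.linalg.inv(K).T
assert np.allclose(xyz, (np.linalg.inv(K) @ pix.T).T)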

Error in DTU Eval

When training with train_net.py, an error occurs during eval:
cv2.error: resize.cpp:4062: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

The error comes from lines 90-93 of dtu/enerf.py:
tar_dpt = data_utils.read_pfm(scene_info['dpt_paths'][tar_view])[0].astype(np.float32)  # tar_dpt: (128, 160)
tar_dpt = cv2.resize(tar_dpt, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_NEAREST)  # tar_dpt: (64, 80)
tar_dpt = tar_dpt[44:556, 80:720]  # tar_dpt: (20, 0)
tar_mask = (tar_dpt > 0.).astype(np.uint8)  # tar_mask: (20, 0)

After slicing, tar_mask is empty.
How can I solve this?
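
For what it is worth, the shape arithmetic above suggests the crop expects a much larger depth map than 128 x 160; a small sketch of the numbers (assuming the crop targets the full-resolution Depth_raw maps at 1200 x 1600):

# Shape arithmetic (illustrative; 1200 x 1600 for Depth_raw is an assumption).
full = (1200, 1600)                  # assumed Depth_raw resolution
half = (full[0] // 2, full[1] // 2)  # after cv2.resize with fx=fy=0.5 -> (600, 800)
crop = (556 - 44, 720 - 80)          # size of the [44:556, 80:720] slice -> (512, 640), the eval resolution
low = (128 // 2, 160 // 2)           # the 128 x 160 depth after the same resize -> (64, 80)
print(half, crop, low)
# Slicing a (64, 80) array with [44:556, 80:720] yields shape (20, 0), hence the empty tar_mask.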

resolution ratio of input image

Hi, the results look a little blurry when I use your gui_human.py to visualize them. Is the resolution ratio (input_ratio in the yaml) the cause of the problem? Would the results look much clearer if the parameter were set to 1.0 for training and inference? Thank you!

custom outdoor dataset

Hi! Regarding building a custom outdoor dataset like yours, could you please give some suggestions on how to obtain my own 'background.ply'? Also, for the 3D bounding box in 'vhull', is there any faster method than those mentioned in EasyMocap? Thank you for your help!

Question about the mask_util

Hi, thanks for your great work.
After reading your paper and code, I noticed a mask_util file that includes ADE20K labels but is not mentioned in the paper.
What is the purpose of this file?
Is it related to your future work, or can I just ignore it?

How to train ENeRF on my own dataset

Dear authors,
Hello! I recently read your ENeRF paper; it is a great piece of work and very inspiring. After reproducing part of your work, I tried to run ENeRF on my own dataset, but some problems came up. May I ask whether I could get your method for converting a COLMAP SfM result into the ENeRF input?

RuntimeError: CUDA out of memory.

Hi, thanks for sharing your great work!

I am trying to run the evaluation on scan114 only (have not had the space to download the other datasets yet). However, I have encountered a CUDA out of memory runtime error as shown, after running the command python run.py --type evaluate --cfg_file configs/enerf/dtu_pretrain.yaml enerf.cas_config.render_if False,True enerf.cas_config.volume_planes 48,8 enerf.eval_depth True:

load model: /home/ENeRF-master/trained_model/enerf/dtu_pretrain/latest.pth
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /home/anaconda3/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth
  0%|                                                                                                                                                                                        | 0/4 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/home/ENeRF-master/run.py", line 111, in <module>
    globals()['run_' + args.type]()
  File "/home/ENeRF-master/run.py", line 70, in run_evaluate
    output = network(batch)
  File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "lib/networks/enerf/network.py", line 96, in forward
    ret_i = self.batchify_rays(
  File "lib/networks/enerf/network.py", line 49, in batchify_rays
    ret = self.render_rays(rays[:, i:i + chunk], **kwargs)
  File "lib/networks/enerf/network.py", line 40, in render_rays
    net_output = nerf_model(vox_feat, img_feat_rgb_dir)
  File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ENeRF-master/lib/networks/enerf/nerf.py", line 40, in forward
    x = torch.cat((x, img_feat_rgb_dir), dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 774.00 MiB (GPU 0; 23.70 GiB total capacity; 1.13 GiB already allocated; 321.56 MiB free; 1.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I have tried to include os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512" at the beginning of the run.py file; however, I received the exact same error. Any suggestions on how I should resolve this error?

Thank you!

AttributeError: frames

Great work! I ran into a problem when using the flower scene from the LLFF dataset to run gui_human.py with the parameters '--cfg_file configs/enerf/llff/flower.yaml'.
The error is:
Traceback (most recent call last):
File "D:\pycharm\ENeRF\gui_human.py", line 380, in <module>
main()
File "D:\pycharm\ENeRF\gui_human.py", line 231, in main
rend = Renderer() # prepare network and dataloader
File "D:\pycharm\ENeRF\gui_human.py", line 50, in __init__
self.frame_start = cfg.test_dataset.frames[0]
File "D:\pycharm\ENeRF\lib\config\yacs.py", line 115, in __getattr__
raise AttributeError(name)
AttributeError: frames
It seems there is no 'frames' entry in test_dataset.
Could you please help me with this problem? Thanks in advance!

Great Work!

ENeRF is the first method to achieve real-time photorealistic rendering of arbitrary dynamic scenes, which will greatly promote future research, and it also has great application value!

Doubt with homo_warp

Hi,

I was working with your code, and while reviewing the projection of the feature maps into the cost volume there is something I don't understand. In the function homo_warp, the projection matrix that is computed goes from target-view camera coordinates to the source view in order to interpolate the features:

def get_proj_mats(batch, src_scale, tar_scale):
    B, S_V, C, H, W = batch['src_inps'].shape
    src_ext = batch['src_exts']
    src_ixt = batch['src_ixts'].clone()
    src_ixt[:, :, :2] *= src_scale
    src_projs = src_ixt @ src_ext[:, :, :3]

    tar_ext = batch['tar_ext']
    tar_ixt = batch['tar_ixt'].clone()
    tar_ixt[:, :2] *= tar_scale
    tar_projs = tar_ixt @ tar_ext[:, :3]
    tar_ones = torch.zeros((B, 1, 4)).to(tar_projs.device)
    tar_ones[:, :, 3] = 1
    tar_projs = torch.cat((tar_projs, tar_ones), dim=1)
    tar_projs_inv = torch.inverse(tar_projs)

    src_projs = src_projs.view(B, S_V, 3, 4)
    tar_projs_inv = tar_projs_inv.view(B, 1, 4, 4)

    proj_mats = src_projs @ tar_projs_inv
    return proj_mats

But when projecting the grid into the image, I don't understand which coordinates are used. Only pixel indices seem to be used, and they are projected into the source image by slicing the projection matrix into a rotation and a translation, even though it also contains the intrinsic matrix:

def homo_warp(src_feat, proj_mat, depth_values, batch):
    B, D, H_T, W_T = depth_values.shape
    C, H_S, W_S = src_feat.shape[1:]
    device = src_feat.device

    R = proj_mat[:, :, :3] # (B, 3, 3)
    T = proj_mat[:, :, 3:] # (B, 3, 1)
    # create grid from the ref frame
    ref_grid = create_meshgrid(H_T, W_T, normalized_coordinates=False,
                               device=device) # (1, H, W, 2)
    ref_grid = ref_grid.permute(0, 3, 1, 2) # (1, 2, H, W)
    ref_grid = ref_grid.reshape(1, 2, H_T*W_T) # (1, 2, H*W)
    ref_grid = ref_grid.expand(B, -1, -1) # (B, 2, H*W)
    ref_grid = torch.cat((ref_grid, torch.ones_like(ref_grid[:,:1])), 1) # (B, 3, H*W)
    ref_grid_d = ref_grid.repeat(1, 1, D) # (B, 3, D*H*W)
    src_grid_d = R @ ref_grid_d + T/depth_values.view(B, 1, D*H_T*W_T)
    del ref_grid_d, ref_grid, proj_mat, R, T, depth_values # release (GPU) memory

    # project negative depth pixels to somewhere outside the image
    # negative_depth_mask = src_grid_d[:, 2:] <= 1e-7
    # src_grid_d[:, 0:1][negative_depth_mask] = W
    # src_grid_d[:, 1:2][negative_depth_mask] = H
    # src_grid_d[:, 2:3][negative_depth_mask] = 1

    src_grid = src_grid_d[:, :2] / torch.clamp_min(src_grid_d[:, 2:], 1e-6) # divide by depth (B, 2, D*H*W)
    # del src_grid_d
    src_grid[:, 0] = (src_grid[:, 0])/((W_S - 1) / 2) - 1 # scale to -1~1
    src_grid[:, 1] = (src_grid[:, 1])/((H_S - 1) / 2) - 1 # scale to -1~1
    src_grid = src_grid.permute(0, 2, 1) # (B, D*H*W, 2)
    src_grid = src_grid.view(B, D, H_T*W_T, 2)

    warped_src_feat = F.grid_sample(src_feat, src_grid,
                                    mode='bilinear', padding_mode='zeros',
                                    align_corners=True) # (B, C, D, H*W)
    warped_src_feat = warped_src_feat.view(B, C, D, H_T, W_T)
    src_grid = src_grid.view(B, D, H_T, W_T, 2)
    if torch.isnan(warped_src_feat).isnan().any():
        __import__('ipdb').set_trace()
    return warped_src_feat, src_grid

Could you explain how the coordinates from the grid are projected into the source image, and in which coordinate system the grid is defined?

Thanks in advance,
Sergio
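
For context, a small numeric check of what homo_warp computes (a sketch, not the repository's code): the grid is in target pixel coordinates, and proj_mat = K_src [R|t]_src @ (K_tar [R|t]_tar)^-1 already contains both intrinsics, so the R and T sliced from it are not a pure rotation and translation. Using [u, v, 1] and dividing T by the depth gives the full projection up to a scale factor that the later perspective division removes:

# Equivalence check (illustrative): homo_warp-style warp vs. explicit back-project + project.
import numpy as np
rng = np.random.default_rng(0)

def rand_pose():
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix as a rotation
    ext = np.eye(4)
    ext[:3, :3], ext[:3, 3] = q, rng.normal(size=3)
    return ext

K_tar = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
K_src = np.array([[480., 0., 300.], [0., 480., 230.], [0., 0., 1.]])
ext_tar, ext_src = rand_pose(), rand_pose()

tar_proj = np.eye(4)
tar_proj[:3] = K_tar @ ext_tar[:3]             # 4x4 with last row [0, 0, 0, 1], as in get_proj_mats
src_proj = K_src @ ext_src[:3]                 # 3x4
proj_mat = src_proj @ np.linalg.inv(tar_proj)  # 3x4
R, T = proj_mat[:, :3], proj_mat[:, 3]

u, v, d = 100.0, 80.0, 2.5                     # a target pixel and a depth hypothesis

g = R @ np.array([u, v, 1.0]) + T / d          # homo_warp's form
uv_warp = g[:2] / g[2]

X = np.linalg.inv(tar_proj) @ np.array([u * d, v * d, d, 1.0])  # back-project to the world
s = src_proj @ X                               # project into the source view
uv_full = s[:2] / s[2]

assert np.allclose(uv_warp, uv_full)           # identical source-pixel coordinates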

Question about config file

Hi, thanks for your great work! During training, I noticed that the code passes through the function build_feature_volume twice; does this mean it is coarse-to-fine? I am also wondering how to change the parameters in cfg.enerf.cas_config, since I cannot find the related parameters in lib.config. Looking forward to your reply!
Best wishes!

How to make bounding box data?

Hi, I want to train ENeRF with my own data.

My data was captured with a 3x7 fixed grid of multi-view cameras.

At the moment, I need to create the bounding-box data (*.npy) for my own data in order to train it in the ENeRF-Outdoor dataset setting.

Please let me know how to create the bounding boxes
(or is there any code for creating them?)

No such file or directory: 'lib/visualizers/enerf.py'

Hi, thank you for your hard work. I have been trying to run the visualize module, but maybe it is not published yet? It says No such file or directory: 'lib/visualizers/enerf.py', or maybe I am doing something wrong.

I was running using the following command python run.py --type visualize --cfg_file configs/enerf/llff_eval.yaml

Looking forward to your release of the visualization code, the interactive rendering code and the outdoor dataset!

Thank you

About the purpose of some entries in zjumocap_train.yaml

I am trying to adjust some values in this file to fine-tune the pretrained model, and a few questions came up. What is the purpose of train_input_views_prob under the enerf entry? My current understanding is that input_views under train_dataset controls the number of input training views, while the one under test_dataset controls the number of views used for evaluation.

I used the following settings for fine-tuning:
train_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: train
    frames: [0, 599, 1]
    input_views: [0, -1, 1]
    render_views: [0, -1, 1]
    input_ratio: 0.5

test_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: test
    frames: [0, 600, 100]
    input_views: [0, -1, 2]
    render_views: [1, -1, 2]
    input_ratio: 0.5
(the other parts are the same as the original file)
In addition, when training on my own ZJU-MoCap-like dataset (6 synchronized cameras), something strange happens: after training, PSNR, SSIM and LPIPS all improve, but the renderings in the GUI look less sharp than those produced with the pretrained model. What could be the reason?

Only 13 FPS?

I reran your algorithm on the LLFF dataset at 512x512 resolution, but I only get about 13 FPS. How can I achieve the 25 FPS mentioned in your paper?

different result of standing and sitting person

Hi, I built two datasets, one of a standing person and one of a sitting person. The training result on the standing data is much better than on the sitting data (by about 4 dB in PSNR). I noticed that someone said OpenPose achieves better results on standing people than on sitting people.
zju3dv/EasyMocap#94
Could that be the reason ENeRF achieves different results? Thank you!

How to run ENeRF on my own data?

Hi, the real-time dynamic rendering demo on your project page is very cool! I want to do the same thing with my own data (a sequence of images). What should I do?

How to handle our own data? Can you provide a tutorial?

Hello, I want to use my own data for training and rendering; how should I process it? In addition, for the video on your project page, could you provide the interfaces and datasets used for testing? I would be very grateful if a tutorial on this could be provided.

Results on ENeRF-Outdoor dataset and poor quality depth

Hi, thanks for the great work!
However, after running your training script (python train_net.py --cfg_file configs/enerf/enerf_outdoor/actor1.yaml) on Actor1 for 50 epochs, I am getting the following results. The color predictions are not as good as advertised on your project page, with lots of warping of the background. Also, the depth maps are quite poor, with the depth of the shadow region being predicted incorrectly. Do you know why this might be?


Color:
https://user-images.githubusercontent.com/9107279/219429442-24e2cc1d-bb5b-4d78-9f58-588e318fdbaa.mp4

Depth:
https://user-images.githubusercontent.com/9107279/219429583-eccd8139-173f-4a6c-b4c6-26d0e83e5db9.mp4

Visualization error in ENeRF-Outdoor dataset

Hi, thanks for the great work!

I tried to visualize the ENeRF-Outdoor dataset by executing the command below (using the pretrained model you provided).

python run.py --type visualize --cfg_file configs/enerf/enerf_outdoor/actor1_path.yaml

However, the error below occurred. When I looked for a solution, my guess was that the pretrained model was trained with multiple GPUs and I loaded it on a single GPU.

Therefore, I would like to ask the following questions.

  1. Please tell me how to visualize the ENeRF-Outdoor dataset on a single GPU using the pretrained model you provided.
  • The ZJU-MoCap interactive rendering and the ZJU-MoCap / LLFF / NeRF / DTU dataset evaluations you provided work normally on my single GPU.
  2. I would like to request the ENeRF-Outdoor / ST-NeRF dataset interactive rendering code shown on your project page.
    I think it has already been developed, so I sincerely ask you to share it.

-----------------------------------ERROR Detail--------------------------------------

EXP NAME: actor1

load model: /home/ubuntu/ENeRF/trained_model/enerf/actor1/latest.pth
Traceback (most recent call last):
File "run.py", line 106, in <module>
globals()['run_' + args.type]()
File "run.py", line 90, in run_visualize
load_network(network,
File "/home/ubuntu/ENeRF/lib/utils/net_utils.py", line 443, in load_network
net.load_state_dict(pretrained_model['net'], strict=strict)
File "/home/ubuntu/.conda/envs/enerf/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Network:
Missing key(s) in state_dict: "feature_net_bg.conv0.0.conv.weight", "feature_net_bg.conv0.0.bn.weight", "feature_net_bg.conv0.0.bn.bias", "feature_net_bg.conv0.0.bn.running_mean", "feature_net_bg.conv0.0.bn.running_var", "feature_net_bg.conv0.1.conv.weight", "feature_net_bg.conv0.1.bn.weight", "feature_net_bg.conv0.1.bn.bias", "feature_net_bg.conv0.1.bn.running_mean", "feature_net_bg.conv0.1.bn.running_var", "feature_net_bg.conv1.0.conv.weight", "feature_net_bg.conv1.0.bn.weight", "feature_net_bg.conv1.0.bn.bias", "feature_net_bg.conv1.0.bn.running_mean", "feature_net_bg.conv1.0.bn.running_var", "feature_net_bg.conv1.1.conv.weight", "feature_net_bg.conv1.1.bn.weight", "feature_net_bg.conv1.1.bn.bias", "feature_net_bg.conv1.1.bn.running_mean", "feature_net_bg.conv1.1.bn.running_var", "feature_net_bg.conv2.0.conv.weight", "feature_net_bg.conv2.0.bn.weight", "feature_net_bg.conv2.0.bn.bias", "feature_net_bg.conv2.0.bn.running_mean", "feature_net_bg.conv2.0.bn.running_var", "feature_net_bg.conv2.1.conv.weight", "feature_net_bg.conv2.1.bn.weight", "feature_net_bg.conv2.1.bn.bias", "feature_net_bg.conv2.1.bn.running_mean", "feature_net_bg.conv2.1.bn.running_var", "feature_net_bg.toplayer.weight", "feature_net_bg.toplayer.bias", "feature_net_bg.lat1.weight", "feature_net_bg.lat1.bias", "feature_net_bg.lat0.weight", "feature_net_bg.lat0.bias", "feature_net_bg.smooth1.weight", "feature_net_bg.smooth1.bias", "feature_net_bg.smooth0.weight", "feature_net_bg.smooth0.bias", "cost_reg_0_layer0.conv0.conv.weight", "cost_reg_0_layer0.conv0.bn.weight", "cost_reg_0_layer0.conv0.bn.bias", "cost_reg_0_layer0.conv0.bn.running_mean", "cost_reg_0_layer0.conv0.bn.running_var", "cost_reg_0_layer0.conv1.conv.weight", "cost_reg_0_layer0.conv1.bn.weight", "cost_reg_0_layer0.conv1.bn.bias", "cost_reg_0_layer0.conv1.bn.running_mean", "cost_reg_0_layer0.conv1.bn.running_var", "cost_reg_0_layer0.conv2.conv.weight", "cost_reg_0_layer0.conv2.bn.weight", "cost_reg_0_layer0.conv2.bn.bias", "cost_reg_0_layer0.conv2.bn.running_mean", "cost_reg_0_layer0.conv2.bn.running_var", "cost_reg_0_layer0.conv3.conv.weight", "cost_reg_0_layer0.conv3.bn.weight", "cost_reg_0_layer0.conv3.bn.bias", "cost_reg_0_layer0.conv3.bn.running_mean", "cost_reg_0_layer0.conv3.bn.running_var", "cost_reg_0_layer0.conv4.conv.weight", "cost_reg_0_layer0.conv4.bn.weight", "cost_reg_0_layer0.conv4.bn.bias", "cost_reg_0_layer0.conv4.bn.running_mean", "cost_reg_0_layer0.conv4.bn.running_var", "cost_reg_0_layer0.conv9.0.weight", "cost_reg_0_layer0.conv9.1.weight", "cost_reg_0_layer0.conv9.1.bias", "cost_reg_0_layer0.conv9.1.running_mean", "cost_reg_0_layer0.conv9.1.running_var", "cost_reg_0_layer0.conv11.0.weight", "cost_reg_0_layer0.conv11.1.weight", "cost_reg_0_layer0.conv11.1.bias", "cost_reg_0_layer0.conv11.1.running_mean", "cost_reg_0_layer0.conv11.1.running_var", "cost_reg_0_layer0.depth_conv.0.weight", "cost_reg_0_layer0.feat_conv.0.weight", "nerf_0_layer0.agg.global_fc.0.weight", "nerf_0_layer0.agg.global_fc.0.bias", "nerf_0_layer0.agg.agg_w_fc.0.weight", "nerf_0_layer0.agg.agg_w_fc.0.bias", "nerf_0_layer0.agg.fc.0.weight", "nerf_0_layer0.agg.fc.0.bias", "nerf_0_layer0.lr0.0.weight", "nerf_0_layer0.lr0.0.bias", "nerf_0_layer0.sigma.0.weight", "nerf_0_layer0.sigma.0.bias", "nerf_0_layer0.color.0.weight", "nerf_0_layer0.color.0.bias", "nerf_0_layer0.color.2.weight", "nerf_0_layer0.color.2.bias", "cost_reg_0_bg.conv0.conv.weight", "cost_reg_0_bg.conv0.bn.weight", "cost_reg_0_bg.conv0.bn.bias", "cost_reg_0_bg.conv0.bn.running_mean", 
"cost_reg_0_bg.conv0.bn.running_var", "cost_reg_0_bg.conv1.conv.weight", "cost_reg_0_bg.conv1.bn.weight", "cost_reg_0_bg.conv1.bn.bias", "cost_reg_0_bg.conv1.bn.running_mean", "cost_reg_0_bg.conv1.bn.running_var", "cost_reg_0_bg.conv2.conv.weight", "cost_reg_0_bg.conv2.bn.weight", "cost_reg_0_bg.conv2.bn.bias", "cost_reg_0_bg.conv2.bn.running_mean", "cost_reg_0_bg.conv2.bn.running_var", "cost_reg_0_bg.conv3.conv.weight", "cost_reg_0_bg.conv3.bn.weight", "cost_reg_0_bg.conv3.bn.bias", "cost_reg_0_bg.conv3.bn.running_mean", "cost_reg_0_bg.conv3.bn.running_var", "cost_reg_0_bg.conv4.conv.weight", "cost_reg_0_bg.conv4.bn.weight", "cost_reg_0_bg.conv4.bn.bias", "cost_reg_0_bg.conv4.bn.running_mean", "cost_reg_0_bg.conv4.bn.running_var", "cost_reg_0_bg.conv9.0.weight", "cost_reg_0_bg.conv9.1.weight", "cost_reg_0_bg.conv9.1.bias", "cost_reg_0_bg.conv9.1.running_mean", "cost_reg_0_bg.conv9.1.running_var", "cost_reg_0_bg.conv11.0.weight", "cost_reg_0_bg.conv11.1.weight", "cost_reg_0_bg.conv11.1.bias", "cost_reg_0_bg.conv11.1.running_mean", "cost_reg_0_bg.conv11.1.running_var", "cost_reg_0_bg.depth_conv.0.weight", "cost_reg_0_bg.feat_conv.0.weight", "nerf_0_bg.agg.global_fc.0.weight", "nerf_0_bg.agg.global_fc.0.bias", "nerf_0_bg.agg.agg_w_fc.0.weight", "nerf_0_bg.agg.agg_w_fc.0.bias", "nerf_0_bg.agg.fc.0.weight", "nerf_0_bg.agg.fc.0.bias", "nerf_0_bg.lr0.0.weight", "nerf_0_bg.lr0.0.bias", "nerf_0_bg.sigma.0.weight", "nerf_0_bg.sigma.0.bias", "nerf_0_bg.color.0.weight", "nerf_0_bg.color.0.bias", "nerf_0_bg.color.2.weight", "nerf_0_bg.color.2.bias", "cost_reg_1_layer0.conv0.conv.weight", "cost_reg_1_layer0.conv0.bn.weight", "cost_reg_1_layer0.conv0.bn.bias", "cost_reg_1_layer0.conv0.bn.running_mean", "cost_reg_1_layer0.conv0.bn.running_var", "cost_reg_1_layer0.conv1.conv.weight", "cost_reg_1_layer0.conv1.bn.weight", "cost_reg_1_layer0.conv1.bn.bias", "cost_reg_1_layer0.conv1.bn.running_mean", "cost_reg_1_layer0.conv1.bn.running_var", "cost_reg_1_layer0.conv2.conv.weight", "cost_reg_1_layer0.conv2.bn.weight", "cost_reg_1_layer0.conv2.bn.bias", "cost_reg_1_layer0.conv2.bn.running_mean", "cost_reg_1_layer0.conv2.bn.running_var", "cost_reg_1_layer0.conv3.conv.weight", "cost_reg_1_layer0.conv3.bn.weight", "cost_reg_1_layer0.conv3.bn.bias", "cost_reg_1_layer0.conv3.bn.running_mean", "cost_reg_1_layer0.conv3.bn.running_var", "cost_reg_1_layer0.conv4.conv.weight", "cost_reg_1_layer0.conv4.bn.weight", "cost_reg_1_layer0.conv4.bn.bias", "cost_reg_1_layer0.conv4.bn.running_mean", "cost_reg_1_layer0.conv4.bn.running_var", "cost_reg_1_layer0.conv9.0.weight", "cost_reg_1_layer0.conv9.1.weight", "cost_reg_1_layer0.conv9.1.bias", "cost_reg_1_layer0.conv9.1.running_mean", "cost_reg_1_layer0.conv9.1.running_var", "cost_reg_1_layer0.conv11.0.weight", "cost_reg_1_layer0.conv11.1.weight", "cost_reg_1_layer0.conv11.1.bias", "cost_reg_1_layer0.conv11.1.running_mean", "cost_reg_1_layer0.conv11.1.running_var", "cost_reg_1_layer0.depth_conv.0.weight", "cost_reg_1_layer0.feat_conv.0.weight", "nerf_1_layer0.agg.global_fc.0.weight", "nerf_1_layer0.agg.global_fc.0.bias", "nerf_1_layer0.agg.agg_w_fc.0.weight", "nerf_1_layer0.agg.agg_w_fc.0.bias", "nerf_1_layer0.agg.fc.0.weight", "nerf_1_layer0.agg.fc.0.bias", "nerf_1_layer0.lr0.0.weight", "nerf_1_layer0.lr0.0.bias", "nerf_1_layer0.sigma.0.weight", "nerf_1_layer0.sigma.0.bias", "nerf_1_layer0.color.0.weight", "nerf_1_layer0.color.0.bias", "nerf_1_layer0.color.2.weight", "nerf_1_layer0.color.2.bias", "cost_reg_1_bg.conv0.conv.weight", "cost_reg_1_bg.conv0.bn.weight", 
"cost_reg_1_bg.conv0.bn.bias", "cost_reg_1_bg.conv0.bn.running_mean", "cost_reg_1_bg.conv0.bn.running_var", "cost_reg_1_bg.conv1.conv.weight", "cost_reg_1_bg.conv1.bn.weight", "cost_reg_1_bg.conv1.bn.bias", "cost_reg_1_bg.conv1.bn.running_mean", "cost_reg_1_bg.conv1.bn.running_var", "cost_reg_1_bg.conv2.conv.weight", "cost_reg_1_bg.conv2.bn.weight", "cost_reg_1_bg.conv2.bn.bias", "cost_reg_1_bg.conv2.bn.running_mean", "cost_reg_1_bg.conv2.bn.running_var", "cost_reg_1_bg.conv3.conv.weight", "cost_reg_1_bg.conv3.bn.weight", "cost_reg_1_bg.conv3.bn.bias", "cost_reg_1_bg.conv3.bn.running_mean", "cost_reg_1_bg.conv3.bn.running_var", "cost_reg_1_bg.conv4.conv.weight", "cost_reg_1_bg.conv4.bn.weight", "cost_reg_1_bg.conv4.bn.bias", "cost_reg_1_bg.conv4.bn.running_mean", "cost_reg_1_bg.conv4.bn.running_var", "cost_reg_1_bg.conv9.0.weight", "cost_reg_1_bg.conv9.1.weight", "cost_reg_1_bg.conv9.1.bias", "cost_reg_1_bg.conv9.1.running_mean", "cost_reg_1_bg.conv9.1.running_var", "cost_reg_1_bg.conv11.0.weight", "cost_reg_1_bg.conv11.1.weight", "cost_reg_1_bg.conv11.1.bias", "cost_reg_1_bg.conv11.1.running_mean", "cost_reg_1_bg.conv11.1.running_var", "cost_reg_1_bg.depth_conv.0.weight", "cost_reg_1_bg.feat_conv.0.weight", "nerf_1_bg.agg.global_fc.0.weight", "nerf_1_bg.agg.global_fc.0.bias", "nerf_1_bg.agg.agg_w_fc.0.weight", "nerf_1_bg.agg.agg_w_fc.0.bias", "nerf_1_bg.agg.fc.0.weight", "nerf_1_bg.agg.fc.0.bias", "nerf_1_bg.lr0.0.weight", "nerf_1_bg.lr0.0.bias", "nerf_1_bg.sigma.0.weight", "nerf_1_bg.sigma.0.bias", "nerf_1_bg.color.0.weight", "nerf_1_bg.color.0.bias", "nerf_1_bg.color.2.weight", "nerf_1_bg.color.2.bias".
Unexpected key(s) in state_dict: "cost_reg_0.conv0.conv.weight", "cost_reg_0.conv0.bn.weight", "cost_reg_0.conv0.bn.bias", "cost_reg_0.conv0.bn.running_mean", "cost_reg_0.conv0.bn.running_var", "cost_reg_0.conv0.bn.num_batches_tracked", "cost_reg_0.conv1.conv.weight", "cost_reg_0.conv1.bn.weight", "cost_reg_0.conv1.bn.bias", "cost_reg_0.conv1.bn.running_mean", "cost_reg_0.conv1.bn.running_var", "cost_reg_0.conv1.bn.num_batches_tracked", "cost_reg_0.conv2.conv.weight", "cost_reg_0.conv2.bn.weight", "cost_reg_0.conv2.bn.bias", "cost_reg_0.conv2.bn.running_mean", "cost_reg_0.conv2.bn.running_var", "cost_reg_0.conv2.bn.num_batches_tracked", "cost_reg_0.conv3.conv.weight", "cost_reg_0.conv3.bn.weight", "cost_reg_0.conv3.bn.bias", "cost_reg_0.conv3.bn.running_mean", "cost_reg_0.conv3.bn.running_var", "cost_reg_0.conv3.bn.num_batches_tracked", "cost_reg_0.conv4.conv.weight", "cost_reg_0.conv4.bn.weight", "cost_reg_0.conv4.bn.bias", "cost_reg_0.conv4.bn.running_mean", "cost_reg_0.conv4.bn.running_var", "cost_reg_0.conv4.bn.num_batches_tracked", "cost_reg_0.conv9.0.weight", "cost_reg_0.conv9.1.weight", "cost_reg_0.conv9.1.bias", "cost_reg_0.conv9.1.running_mean", "cost_reg_0.conv9.1.running_var", "cost_reg_0.conv9.1.num_batches_tracked", "cost_reg_0.conv11.0.weight", "cost_reg_0.conv11.1.weight", "cost_reg_0.conv11.1.bias", "cost_reg_0.conv11.1.running_mean", "cost_reg_0.conv11.1.running_var", "cost_reg_0.conv11.1.num_batches_tracked", "cost_reg_0.depth_conv.0.weight", "cost_reg_0.feat_conv.0.weight", "nerf_0.agg.view_fc.0.weight", "nerf_0.agg.view_fc.0.bias", "nerf_0.agg.global_fc.0.weight", "nerf_0.agg.global_fc.0.bias", "nerf_0.agg.agg_w_fc.0.weight", "nerf_0.agg.agg_w_fc.0.bias", "nerf_0.agg.fc.0.weight", "nerf_0.agg.fc.0.bias", "nerf_0.lr0.0.weight", "nerf_0.lr0.0.bias", "nerf_0.sigma.0.weight", "nerf_0.sigma.0.bias", "nerf_0.color.0.weight", "nerf_0.color.0.bias", "nerf_0.color.2.weight", "nerf_0.color.2.bias", "cost_reg_1.conv0.conv.weight", "cost_reg_1.conv0.bn.weight", "cost_reg_1.conv0.bn.bias", "cost_reg_1.conv0.bn.running_mean", "cost_reg_1.conv0.bn.running_var", "cost_reg_1.conv0.bn.num_batches_tracked", "cost_reg_1.conv1.conv.weight", "cost_reg_1.conv1.bn.weight", "cost_reg_1.conv1.bn.bias", "cost_reg_1.conv1.bn.running_mean", "cost_reg_1.conv1.bn.running_var", "cost_reg_1.conv1.bn.num_batches_tracked", "cost_reg_1.conv2.conv.weight", "cost_reg_1.conv2.bn.weight", "cost_reg_1.conv2.bn.bias", "cost_reg_1.conv2.bn.running_mean", "cost_reg_1.conv2.bn.running_var", "cost_reg_1.conv2.bn.num_batches_tracked", "cost_reg_1.conv3.conv.weight", "cost_reg_1.conv3.bn.weight", "cost_reg_1.conv3.bn.bias", "cost_reg_1.conv3.bn.running_mean", "cost_reg_1.conv3.bn.running_var", "cost_reg_1.conv3.bn.num_batches_tracked", "cost_reg_1.conv4.conv.weight", "cost_reg_1.conv4.bn.weight", "cost_reg_1.conv4.bn.bias", "cost_reg_1.conv4.bn.running_mean", "cost_reg_1.conv4.bn.running_var", "cost_reg_1.conv4.bn.num_batches_tracked", "cost_reg_1.conv5.conv.weight", "cost_reg_1.conv5.bn.weight", "cost_reg_1.conv5.bn.bias", "cost_reg_1.conv5.bn.running_mean", "cost_reg_1.conv5.bn.running_var", "cost_reg_1.conv5.bn.num_batches_tracked", "cost_reg_1.conv6.conv.weight", "cost_reg_1.conv6.bn.weight", "cost_reg_1.conv6.bn.bias", "cost_reg_1.conv6.bn.running_mean", "cost_reg_1.conv6.bn.running_var", "cost_reg_1.conv6.bn.num_batches_tracked", "cost_reg_1.conv7.0.weight", "cost_reg_1.conv7.1.weight", "cost_reg_1.conv7.1.bias", "cost_reg_1.conv7.1.running_mean", "cost_reg_1.conv7.1.running_var", 
"cost_reg_1.conv7.1.num_batches_tracked", "cost_reg_1.conv9.0.weight", "cost_reg_1.conv9.1.weight", "cost_reg_1.conv9.1.bias", "cost_reg_1.conv9.1.running_mean", "cost_reg_1.conv9.1.running_var", "cost_reg_1.conv9.1.num_batches_tracked", "cost_reg_1.conv11.0.weight", "cost_reg_1.conv11.1.weight", "cost_reg_1.conv11.1.bias", "cost_reg_1.conv11.1.running_mean", "cost_reg_1.conv11.1.running_var", "cost_reg_1.conv11.1.num_batches_tracked", "cost_reg_1.depth_conv.0.weight", "cost_reg_1.feat_conv.0.weight", "nerf_1.agg.view_fc.0.weight", "nerf_1.agg.view_fc.0.bias", "nerf_1.agg.global_fc.0.weight", "nerf_1.agg.global_fc.0.bias", "nerf_1.agg.agg_w_fc.0.weight", "nerf_1.agg.agg_w_fc.0.bias", "nerf_1.agg.fc.0.weight", "nerf_1.agg.fc.0.bias", "nerf_1.lr0.0.weight", "nerf_1.lr0.0.bias", "nerf_1.sigma.0.weight", "nerf_1.sigma.0.bias", "nerf_1.color.0.weight", "nerf_1.color.0.bias", "nerf_1.color.2.weight", "nerf_1.color.2.bias".

NaN in training

Hi, when I trained on my own dataset, an error occurred (see the attached screenshot).
I set 'shuffle' to False to check whether some particular images in my dataset cause this error, but it still occurs randomly (mostly in the first epoch, though it once occurred in the second epoch while the first epoch seemed fine).
Do you have any idea? Thank you for your help!

Issues on evaluation

Hello! Thanks for sharing the code!

When I evaluated this model on the test split of the LLFF dataset (unseen scenes, in theory), I found it performs much better on the 'fortress' scene than other methods such as IBRNet, while the other scenes show similar performance. So I would like to ask whether the 'latest.pth' you released is a model that has not been fine-tuned.

In addition, if I want to render an image whose size does not fit the cost-volume network (such as the original image size of the LLFF dataset), how should I do it? (Maybe by resizing the depth and the original image?)

I'm looking forward to your reply.

Config file for fine-tuning the zjumocap dataset

Hi,

Thanks for sharing this nice work.

I would like to know how to fine-tune the model for the zjumocap dataset.

I have modified the config from zjumocap_eval.yaml, but the results are worse than the pre-trained model.

Do you have any suggestions?

Thanks !!!

Is there a GUI in this project?

Hi, thanks for your hard work. I'm trying to run the ENeRF evaluation and it works normally. I'm curious whether this project has a GUI to display the reconstructed model. Could you please tell me how to run it?

How to train on my own llff data?

Thanks for your great work! I found that the README only explains how to train on the DTU dataset, but Depth_raw is difficult to obtain if I want to train on my own LLFF-style data. What should I do? I would appreciate it if you could reply to me.

Limiting the near_far interpolation range

Hello, the current code uses the 6890 vertices of the fitted single-person SMPL model to obtain a 3D bbox, which is used to limit the depth range (near_far) for cost-volume interpolation. For the multi-person and with-background cases shown on the project page, is this method not used?
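
For reference, the idea described above can be sketched as follows (a generic sketch, not the repository's implementation): transform the SMPL vertices into each camera's frame and take their depth range, with a margin, as near_far.

# Generic near_far-from-vertices sketch (illustrative, not the repository's code).
import numpy as np
def near_far_from_vertices(verts_world, w2c, margin=0.05):
    # verts_world: (N, 3) world-space vertices (e.g. the 6890 SMPL vertices)
    # w2c: (4, 4) world-to-camera extrinsic matrix
    verts_h = np.concatenate([verts_world, np.ones((len(verts_world), 1))], axis=1)
    z = (verts_h @ w2c.T)[:, 2]                 # depth of each vertex in the camera frame
    near = max(float(z.min()) - margin, 1e-3)   # keep the near plane positive
    far = float(z.max()) + margin
    return near, far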

Properly formatted annots.npy file

May I ask how to obtain annots.json and the new annots.npy? I want to fine-tune the model using my own dataset, which is similar to zju-mocap, but I found that the annots.npy file formats of the two are different, which makes mine unreadable. How do I generate these files?

Camera Color Calibration

Thanks for your contribution. How were the camera colors calibrated when capturing the dataset used in your project? Did you need to apply some manual settings on the cameras?

Question about quantitative evaluation on pretrained model

I used the provided generalizable model to evaluate on the DTU dataset as described in the README, but the PSNR, SSIM and LPIPS values I obtained (see the attached screenshot) differ from the quantitative results reported in the README.

I wonder why the quantitative evaluation results are different. Could you also share your evaluation results? Thanks

KeyError: 'rgb_level0'

I retrained on the zjumocap dataset with the command: python train_net.py --cfg_file configs/enerf/zjumocap_eval.yaml
but I get the following error:
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /opt/conda/lib/python3.8/site-packages/lpips/weights/v0.1/vgg.pth
Traceback (most recent call last):
File "train_net.py", line 117, in <module>
main()
File "train_net.py", line 109, in main
train(cfg, network)
File "train_net.py", line 51, in train
trainer.train(epoch, train_loader, optimizer, recorder)
File "/dfs/data/ENeRF/lib/train/trainers/trainer.py", line 56, in train
output, loss, loss_stats, image_stats = self.network(batch)
File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1015, in _call_impl
return forward_call(*input, **kwargs)
File "lib/train/losses/enerf.py", line 23, in forward
color_loss = self.color_crit(batch[f'rgb_{i}'], output[f'rgb_level{i}'])
KeyError: 'rgb_level0'

About video on the website

Hi Haotong and Sida,

awesome work! I believe many are as impressed as I am. My question is about the experimental setting for the video on your website since it's not mentioned in the main paper. I wonder:
(1) how many cameras are you using?
(2) what are the training and testing splits? e.g. is testing done on completely new videos? Is the training data similar to the test videos? etc.
(3) Are these generated with finetuning?

Thank you very much for your awesome work!

run gui_human.py on my own dataset. Problem with visualization

I built my own dataset similar to zju-mocap. The input image and mask size is 1088 x 1920 and the input ratio is 0.5, but I got an error during visualization (see the attached screenshot).

I noticed that the length of the output is 786432 = 512 x 512 x 3, which happens to be the output size of the default zju-mocap setting (1024 x 1024 x 3 at the corresponding input_ratio). What should I do to get the correct output size?
P.S.: I found that if the output is forcibly reshaped to 512 x 512 x 3, the visualization is incomplete:
pred_img = output[f'rgb_level{i}'][b].reshape(512, 512, 3)
