
Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision

Project Page: https://diffusion-with-forward-models.github.io/ | Paper

Abstract

Denoising diffusion models have emerged as a powerful class of generative models capable of capturing the distributions of complex, real-world signals. However, current approaches can only model distributions for which training samples are directly accessible, which is not the case in many real-world tasks. In inverse graphics, for instance, we seek to sample from a distribution over 3D scenes consistent with an image but do not have access to ground-truth 3D scenes, only 2D images. We present a new class of conditional denoising diffusion probabilistic models that learn to sample from distributions of signals that are never observed directly, but instead are only measured through a known differentiable forward model that generates partial observations of the unknown signal. To accomplish this, we directly integrate the forward model into the denoising process. At test time, our approach enables us to sample from the distribution over underlying signals consistent with some partial observation. We demonstrate the efficacy of our approach on three challenging computer vision tasks. For instance, in inverse graphics, we demonstrate that our model in combination with a 3D-structured conditioning method enables us to directly sample from the distribution of 3D scenes consistent with a single 2D input image.

Usage

Environment Setup

# create and activate the conda environment
conda create -n dfm python=3.9 -y
conda activate dfm

# install PyTorch 2.0.1, torchvision, and the PyTorch3D dependencies (CUDA 11.7 wheels)
pip install torch==2.0.1 torchvision
conda install -y -c fvcore -c iopath -c conda-forge fvcore iopath
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu117_pyt201/download.html

# install the remaining requirements and the repository itself
pip install -r requirements.txt
python setup.py develop
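
Optionally, the following is a quick, unofficial sanity check that the core dependencies import and that CUDA is visible. It is a minimal sketch, assuming the CUDA 11.7 / PyTorch 2.0.1 wheels from the commands above were installed.

# sanity_check.py -- unofficial sketch, not part of the repository
import torch
import pytorch3d

print("torch:", torch.__version__)            # expected: 2.0.1
print("pytorch3d:", pytorch3d.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))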

Pretrained Models

You can download the pretrained models from here and place them in the files folder.
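
To confirm that a checkpoint was downloaded and placed correctly, the sketch below loads it on the CPU and lists its top-level keys. This is an unofficial check; it assumes the file is a standard torch pickle, and the actual key names may differ.

# inspect_checkpoint.py -- unofficial sketch; assumes a standard torch checkpoint
import torch

ckpt = torch.load("files/co3d_model.pt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    # print top-level entries (e.g. model / EMA weights); names are not documented here
    print(list(ckpt.keys()))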

Prepare CO3D Dataset

python data_io/co3d_new.py --generate_info_file  --generate_camera_quality_file --generate_per_scene_scale --dataset_root CO3D_ROOT 

The scene scale calculation can take a few hours. Alternatively, you can download our precomputed statistics from here and skip the --generate_per_scene_scale flag during dataset preparation.
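
Before running the preparation script, it can save time to confirm that the dataset root follows the standard CO3D-v2 layout, in which each category folder contains a frame_annotations.jgz file (a wrong root causes the error reported in the "preparing CO3D dataset" issue below). The following is a minimal, unofficial check under that assumption; the category list is a hypothetical subset.

# check_co3d_root.py -- unofficial sketch; assumes the standard CO3D-v2 layout
import os

dataset_root = "CO3D_ROOT"      # replace with your actual CO3D root
categories = ["hydrant"]        # hypothetical subset; extend as needed

for category in categories:
    annotations = os.path.join(dataset_root, category, "frame_annotations.jgz")
    status = "ok" if os.path.isfile(annotations) else "MISSING"
    print(f"{category}: {annotations} [{status}]")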

CO3D Inference

# hydrant one shot (faster, used for metric comparison)
python experiment_scripts/co3d_results.py dataset=CO3D name=co3d_oneshot_debug_new_branch ngpus=1 feats_cond=True wandb=online checkpoint_path=files/co3d_model.pt   use_abs_pose=True sampling_type=oneshot use_dataset_pose=True image_size=128

# hydrant 5-step  (slower, used for visualization)
python experiment_scripts/co3d_results.py dataset=CO3D name=co3d_autoregressive_5step ngpus=1 feats_cond=True wandb=online checkpoint_path=files/co3d_model.pt  use_abs_pose=True sampling_type=autoregressive use_dataset_pose=True  all_class=True test_autoregressive_stepsize=41 image_size=128

CO3D Training

# first train two-view pixelnerf  
torchrun  --nnodes 1 --nproc_per_node 8   experiment_scripts/train_pixelnerf.py dataset=CO3D name=pn_2ctxt  num_context=2 num_target=2 lr=2e-5 batch_size=16  wandb=online use_abs_pose=true scale_aug_ratio=0.2

# train at 64 resolution
torchrun  --nnodes 1 --nproc_per_node 8 experiment_scripts/train_3D_diffusion.py use_abs_pose=True dataset=CO3D lr=2e-5 ngpus=8 setting_name=co3d_3ctxt feats_cond=True wandb=online dataset.lpips_loss_weight=0.2 name=co3d scale_aug_ratio=0.2 load_pn=True checkpoint_path=PN_PATH

# finetune model at 128 resolution 
torchrun  --nnodes 1 --nproc_per_node 8 experiment_scripts/train_3D_diffusion.py use_abs_pose=True dataset=CO3D lr=2e-5 ngpus=8 setting_name=co3d_3ctxt feats_cond=True wandb=online dataset.lpips_loss_weight=0.2 name=co3d_128res scale_aug_ratio=0.2 checkpoint_path=CKPT_64  image_size=128

Prepare RealEstate10k Dataset

Download the dataset following the instructions here.

RealEstate10k Inference

python experiment_scripts/re_results.py dataset=realestate batch_size=1 num_target=1 num_context=1 model_type=dit feats_cond=true sampling_type=simple max_scenes=10000 stage=test use_guidance=true guidance_scale=2.0 temperature=0.85 sampling_steps=50 name=re10k_inference image_size=128 checkpoint_path=files/re10k_model.pt wandb=online

RealEstate10k Training

# train at 64 resolution
torchrun  --nnodes 1 --nproc_per_node 8 experiment_scripts/train_3D_diffusion.py dataset=realestate setting_name=re name=re10k mode=cond feats_cond=true wandb=online ngpus=8 use_guidance=true image_size=64

# finetune at 128 resolution 
torchrun  --nnodes 1 --nproc_per_node 8 experiment_scripts/train_3D_diffusion.py dataset=realestate setting_name=re_128res name=re10k mode=cond feats_cond=true wandb=online ngpus=8 use_guidance=true checkpoint_path=TBA image_size=128

Logging

We use wandb for logging. Enter the relevant information in configurations/wandb/online.yaml to use this feature. Logging can be disabled by setting wandb=local.
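
For reference, the information entered in the config roughly corresponds to the arguments of wandb.init. The sketch below shows an equivalent call; it is only an illustration, and the exact keys expected in configurations/wandb/online.yaml, as well as the entity and project values, are assumptions.

# unofficial illustration of the wandb call that the online config ultimately drives
import wandb

run = wandb.init(
    entity="your-wandb-entity",   # placeholder: your wandb account or team
    project="dfm",                # hypothetical project name
    mode="online",                # "disabled" turns wandb logging off entirely
)
run.finish()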

Citation

If you find our work useful in your research, please cite:

@article{tewari2023diffusion,
      title={Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision}, 
      author={Ayush Tewari and Tianwei Yin and George Cazenavette and Semon Rezchikov and Joshua B. Tenenbaum and Frédo Durand and William T. Freeman and Vincent Sitzmann},
      year={2023},
      journal={NeurIPS}
}

dfm's People

Contributors

tianweiy, ayush-tewari



dfm's Issues

Sparsefusion baseline

Hi,

Did you happen to retrain the Sparsefusion pipeline on the Co3D dataset with background included (without masks)? If so, could you please release those weights (before CVPR)?

Thanks!
Tarasha

Simple test with one picture

Can we run a simple test with a single image on the RE10K pt model without pose data?
I would like to test the basic capabilities first.

Depth visualization looks strange

Hi,

Thanks for your excellent work.

I ran the inference code on the CO3D hydrant category following the instructions. The generated views look correct, but the depth seems strange. Is that reasonable?


Empty sampled points

Hi,

Thanks for your excellent work. I tried to run the inference code on co3d hydrant. However, the following issue occurred:

model dit
NOT LOADING DIT WEIGHTS
feats_cond True
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
/home/user/miniconda3/envs/nr/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/home/user/miniconda3/envs/nr/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: /home/user/miniconda3/envs/nr/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth
batch size: 1
checkpoint path: files/co3d_model.pt
step optimizer not found
run dir: /data1/user/codes/DFM/wandb/run-20231019_184837-vnnpa2on/files
wandb: WARNING Symlinked 0 file into the W&B run directory, call wandb.save again to sync new files.
wandb: WARNING Symlinked 0 file into the W&B run directory, call wandb.save again to sync new files.
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /home/user/miniconda3/envs/nr/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth
video_idx: 0, len: 1
Starting sample 0
Error executing job with overrides: ['dataset=CO3D', 'name=co3d_oneshot_debug_new_branch', 'ngpus=1', 'feats_cond=True', 'wandb=online', 'checkpoint_path=files/co3d_model.pt', 'use_abs_pose=True', 'sampling_type=oneshot', 'use_dataset_pose=True', 'image_size=128']
Traceback (most recent call last):
  File "/data1/user/codes/DFM/experiment_scripts/co3d_results.py", line 414, in train
    out = trainer.ema.ema_model.sample(batch_size=1, inp=inp)
  File "/home/user/miniconda3/envs/nr/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/data1/user/codes/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 576, in sample
    return sample_fn(
  File "/home/user/miniconda3/envs/nr/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/data1/user/codes/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 468, in ddim_sample
    ctxt_rgbd, trgt_rgbd, ctxt_feats = self.model.render_ctxt_from_trgt_cam(
  File "/data1/user/codes/DFM/PixelNeRF/pixelnerf_model_cond.py", line 259, in render_ctxt_from_trgt_cam
    rgb, depth, rendered_feats = self.render_full_in_patches(
  File "/data1/user/codes/DFM/PixelNeRF/pixelnerf_model_cond.py", line 187, in render_full_in_patches
    rgb, depth, misc = self.renderer_coarse(
  File "/home/user/miniconda3/envs/nr/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data1/user/codes/DFM/PixelNeRF/renderer.py", line 301, in forward
    sigma = sigma.view(batch_size, num_rays, self.n_samples, 1)
RuntimeError: shape '[1, 1024, 64, 1]' is invalid for input of size 0

Could something be wrong with the preprocessed data?

Data Loading

Can you explain the file layout used by default for RealEstate10k testing? By default, the data is saved under root/[test or train]/[subdirectory_name]/data.npz/[image_name.jpg.npy]. Is this also the structure that you used?

The data preparation instructions are incomplete; kindly specify the complete file structure, i.e. how the files need to be saved for correct loading.

missing pose file

hi,

Thanks for releasing this great work. I am working on reproducing the results on RealEstate10k dataset and met two problems:

Best,
Shengyu

Depths not consistent in RealEstate10K

Hello authors,

I am new to this area and very much admire your work. I am using the RealEstate10K model for my final project in architectural modeling, with simple sampling. Naturally, depth is quite important for me. However, the depth does not appear consistent when I build a point cloud from it in Open3D.

It remains almost perfectly consistent for the first few novel views, but by the last viewpoint it becomes quite different. Please see the attached images. I understand this should not happen, since the model is supposed to recover the same 3D scene. I am confident in my point-cloud code, as I use it for other projects as well, so it cannot be the source of the error.

(two screenshots attached, dated 2023-12-13)

An error in fine-tuning the model for resolution 128 when loading the parameters of resolution 64.

Hi! I encountered an error:

Traceback (most recent call last):
  File "/group/30042/ozhengchen/pano_aigc/DFM/experiment_scripts/train_3D_diffusion.py", line 86, in train
    trainer = Trainer(
  File "/group/30042/ozhengchen/pano_aigc/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 1028, in __init__
    self.load(checkpoint_path)
  File "/group/30042/ozhengchen/pano_aigc/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 1106, in load
    model.load_state_dict(data["model"], strict=True)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GaussianDiffusion:
        size mismatch for model.enc.pos_embed: copying a param with shape torch.Size([1, 256, 1152]) from checkpoint, the shape in current model is torch.Size([1, 1024, 1152]).

Do you have any idea about how to fix it?

CNN_Refine not defined in DiT

Hi, I am facing an issue with running the DFM code. It says that the DiT model seems not to have the cnn_refine defined. What is the workaround? Has anyone else faced this issue?

Input:

(dfm) Singularity> python experiment_scripts/re_results.py dataset=realestate batch_size=1 num_target=1 num_context=1 model_type=dit feats_cond=true sampling_type=simple max_scenes=10000 stage=test use_guidance=true guidance_scale=2.0 temperature=0.85 sampling_steps=50 name=re10k_inference image_size=128 checkpoint_path=files/re10k_model.pt wandb=local

Output:

...
model dit
Error executing job with overrides: ['dataset=realestate', 'batch_size=1', 'num_target=1', 'num_context=1', 'model_type=dit', 'feats_cond=true', 'sampling_type=simple', 'max_scenes=10000', 'stage=test', 'use_guidance=true', 'guidance_scale=2.0', 'temperature=0.85', 'sampling_steps=50', 'name=re10k_inference', 'image_size=128', 'checkpoint_path=files/re10k_model.pt', 'wandb=local']
Traceback (most recent call last):
  File "/home/usr/DFM/experiment_scripts/re_results.py", line 77, in train
    model = PixelNeRFModelCond(
  File "/home/usr/EK/DFM/PixelNeRF/pixelnerf_model_cond.py", line 70, in __init__
    self.enc = DiT(
  File "/home/usr/DFM/PixelNeRF/transformer/DiT.py", line 271, in __init__
    self.initialize_weights()
  File "/home/usr/DFM/PixelNeRF/transformer/DiT.py", line 319, in initialize_weights
    self.cnn_refine.weight.data.fill_(0)
  File "/home/usr/.conda/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'DiT' object has no attribute 'cnn_refine'

preparing CO3D dataset

When I run python data_io/co3d_new.py --generate_info_file --generate_camera_quality_file --generate_per_scene_scale --dataset_root CO3D_ROOT ,

I get the following error:

Traceback (most recent call last):
  File "/home/jun/Documents/DFM/data_io/co3d_new.py", line 592, in <module>
    init_info_file(categories, args.dataset_root)
  File "/home/jun/Documents/DFM/data_io/co3d_new.py", line 381, in init_info_file
    dataset_map = get_dataset_map(dataset_root, category, 'fewview_dev')
  File "/home/jun/Documents/DFM/data_io/co3d_new.py", line 34, in get_dataset_map
    dataset_map_provider = JsonIndexDatasetMapProviderV2(
  File "<string>", line 15, in __init__
  File "/home/jun/Documents/DFM/data_io/co3d/json_index_dataset_map_provider_v2.py", line 214, in __post_init__
    dataset_map = self._load_category(self.category)
  File "/home/jun/Documents/DFM/data_io/co3d/json_index_dataset_map_provider_v2.py", line 234, in _load_category
    raise ValueError(
ValueError: Looking for frame annotations in CO3D_ROOT/hydrant/frame_annotations.jgz. Please specify a correct dataset_root folder. Note: By default the root folder is taken from the CO3DV2_DATASET_ROOT environment variable.

Any ideas??
Thanks!

Out of GPU Memory when running train_3D_diffusion.py

Hi! DFM is a great work! I'm trying it for my research.

But when I ran the following command on 4 A100 (40G) GPU cards, I got an out-of-GPU-memory error.
I have already changed "batch_size" to 1 * ngpus in the get_train_settings function of train_3D_diffusion.py, but the error still appears. Do you know how to fix it?

ngpus=4

torchrun  --nnodes 1 --nproc_per_node $ngpus experiment_scripts/train_3D_diffusion.py dataset=realestate setting_name=re name=re10k mode=cond feats_cond=true wandb=local ngpus=$ngpus use_guidance=true image_size=64

The log is:

......
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/_internal/utils.py", line 223, in run_and_report                          [37/1785]
    raise ex
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
    return func()
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
    lambda: hydra.run(
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "/group/30042/ozhengchen/pano_aigc/DFM/experiment_scripts/train_3D_diffusion.py", line 109, in train
    trainer.train()
  File "/group/30042/ozhengchen/pano_aigc/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 1218, in train
    losses, misc = self.model(data, render_video=render_video)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1156, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/parallel/distributed.py", line 1110, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])  # type: ignore[index]
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 905, in forward
    return self.p_losses(inp, t, *args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/denoising_diffusion_pytorch/denoising_diffusion_pytorch.py", line 722, in p_losses
    model_out, depth, misc = self.model(
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/pixelnerf_model_cond.py", line 675, in forward
    rgbfeats, depth, misc = self.renderer(trgt_c2w, intrinsics, new_xy, rf)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/renderer.py", line 345, in forward
    sigma_all, feats_all, _ = radiance_field(pts_all, viewdirs_all, fine=True)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/pixelnerf_model_cond.py", line 722, in <lambda>
    return lambda x, v, fine: self.pixelNeRF_joint(
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/pixelnerf_helpers.py", line 277, in forward
    mlp_output = self.mlp_fine(mlp_in, ns=num_context, time_emb=t)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/resnetfc_time_embed.py", line 246, in forward
    x = self.blocks[blkid](x, time_emb=time_emb)
  File "/group/30042/ozhengchen/ft_local/anaconda3/envs/dfm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/group/30042/ozhengchen/pano_aigc/DFM/PixelNeRF/resnetfc_time_embed.py", line 94, in forward
    return x_s + dx
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 288.00 MiB (GPU 0; 39.59 GiB total capacity; 36.42 GiB already allocated; 191.19 MiB free; 36.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Pretrain pixel-nerf results in errors when loading the checkpoint

Hi,

I tried to follow the instructions and first train a pixel-nerf checkpoint and then finetune.
However, there are several issues when loading the state-dict for the second-stage training.

Some sources of error are:

  • feats_cond=False in first stage, but feats_cond=True in second stage. This results in different values of cond_feats_dim here:

    cond_feats_dim=(74 if self.self_condition else 71)

  • n_feats_out=0 in first stage, but n_feats_out=64 in second stage. This results in different shapes of the pixelNeRF mlp lin_out matrices.

  • I assume there needs to be an updated loading function, could you please check if you provided the correct loading function?

  • Otherwise, we could just start from second stage training directly. I wonder if it makes a huge difference?

classifier-free guidance

hi,

I noted that you used classifier-free guidance for the RealEstate10K dataset. Do you have any ablation study on this design choice? How much does it contribute to the final performance? It is fairly expensive, as it requires another volume rendering step for the unconditional part, so I am wondering whether it is essential to get DFM to work for scenes or whether it mainly improves over an already-working baseline.

Best
shengyu

Just predicting noise

It seems like this model is not very robust to trajectory changes and can only predict a few trajectories. Is there any explanation for the outputs completely losing any coherent structure of the world?

rgb_video_living_room.mp4

It seems that it is just predicting the next best frame along the dataset trajectories it was trained on, and does not have the capability to handle any other target pose. Any help?

Getting errors when using the CO3D model

_IncompatibleKeys(missing_keys=['model.enc.pos_embed'], unexpected_keys=[])
_IncompatibleKeys(missing_keys=['online_model.model.enc.pos_embed', 'ema_model.model.enc.pos_embed'], unexpected_keys=[])

3D Consistency of RealEstate10K

I have been trying to build a point cloud to check the 3D consistency of the scenes produced by RealEstate sampling. They do not appear to be 3D consistent in the experiments I performed. Could some form of anisotropic scaling be happening in the pixelNeRF-coordinate-to-world-coordinate transform? Where could I be going wrong?

Note: I am trying alternative approaches.

RealEstate RGB frame results not reproducible

(attached: frame_depth_0000 and frame_0000)
The depth and RGB frames above are what I am getting for the RealEstate samples at nsamples=2.
Is this expected, or should I wait for more samples? The out dictionary shows that the videos all have pixel values between 163 and 255. Is it over-normalising somewhere?

Either way, the pose computations are incorrect. One caveat: the data loading did not work as posted, so I had to change those steps, but that should not (ideally) change the nature of the trajectory.
