
yashbhalgat / hashnerf-pytorch


Pure PyTorch implementation of the NVIDIA paper on instant training of neural graphics primitives: https://nvlabs.github.io/instant-ngp/

License: MIT License

Python 98.98% Shell 1.02%
nerf real-time-rendering computer-graphics computer-vision neural-network signed-distance-functions artificial-intelligence machine-learning 3d-reconstruction efficient-training

hashnerf-pytorch's Introduction

HashNeRF-pytorch

🌟 Update 🌟

Get answers to any questions about this repository using this HuggingFace Chatbot.


Instant-NGP recently introduced a Multi-resolution Hash Encoding for neural graphics primitives like NeRFs. The original NVIDIA implementation, written mainly in C++/CUDA and based on tiny-cuda-nn, can train NeRFs up to 100x faster!

This project is a pure PyTorch implementation of Instant-NGP, built to let AI researchers experiment with and build further upon this method.

This project is built on top of the super-useful NeRF-pytorch implementation.

Convergence speed w.r.t. Vanilla NeRF

HashNeRF-pytorch (left) vs NeRF-pytorch (right):

Chair.Convergence.mp4

After training for just 5k iterations (~10 minutes on a single 1050Ti), you start seeing a crisp chair rendering. :)

Instructions

Download the nerf-synthetic dataset from here: Google Drive.

To train a chair HashNeRF model:

python run_nerf.py --config configs/chair.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10

To train for other objects like ficus/hotdog, replace configs/chair.txt with configs/{object}.txt:

[Renderings: hotdog, ficus]

Extras

The code-base has additional support for the following (see the example command after this list):

  • Total Variation Loss for smoother embeddings (use --tv-loss-weight to enable)
  • Sparsity-inducing loss on the ray weights (use --sparse-loss-weight to enable)
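
For example, to train the chair scene with both regularizers enabled (the weight values below are illustrative, not tuned recommendations):

python run_nerf.py --config configs/chair.txt --tv-loss-weight 1e-6 --sparse-loss-weight 1e-10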

ScanNet dataset support

The repo now supports training a NeRF model on a scene from the ScanNet dataset. I personally found setting up the ScanNet dataset to be a bit tricky. Please find some instructions/notes in ScanNet.md.

TODO:

  • Voxel pruning during training and/or inference
  • Accelerated ray tracing, early ray termination

Citation

Kudos to Thomas Müller and the NVIDIA team for this amazing work, which will greatly help accelerate neural graphics research:

@article{mueller2022instant,
    title = {Instant Neural Graphics Primitives with a Multiresolution Hash Encoding},
    author = {Thomas M\"uller and Alex Evans and Christoph Schied and Alexander Keller},
    journal = {arXiv:2201.05989},
    year = {2022},
    month = jan
}

Also, thanks to Yen-Chen Lin for the super-useful NeRF-pytorch:

@misc{lin2020nerfpytorch,
  title={NeRF-pytorch},
  author={Yen-Chen, Lin},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/yenchenlin/nerf-pytorch/}},
  year={2020}
}

If you find this project useful, please consider citing:

@misc{bhalgat2022hashnerfpytorch,
  title={HashNeRF-pytorch},
  author={Yash Bhalgat},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/yashbhalgat/HashNeRF-pytorch/}},
  year={2022}
}

Star History

[Star History Chart]

hashnerf-pytorch's People

Contributors

ludubies, yashbhalgat


hashnerf-pytorch's Issues

Tiny-cuda-nn bindings

Hello, thank you for releasing this really useful code! I have been wondering if you and your team have run any tests with (or considered any integration of) the tiny-cuda-nn bindings already released by NVIDIA, and whether that might speed up performance.

About the SHEncoder

Hi, thanks for sharing this code.
I'm confused about the SHEncoder. Is there a reference for it?
How are the given constants C0 to C4 chosen?
Thank you :-)
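
For context on where those constants come from: C0 to C4 are the standard normalization factors of the real spherical-harmonics basis up to degree 4; the same values appear in other NeRF codebases. Below is a minimal degree-1 sketch (illustrative, not the repo's exact SHEncoder; the function name is hypothetical):

import torch

# Standard real-SH constants (these are where C0, C1, ... come from).
C0 = 0.28209479177387814   # Y_0^0  = 1 / (2*sqrt(pi))
C1 = 0.4886025119029199    # degree-1 factor = sqrt(3) / (2*sqrt(pi))

def sh_encode_deg1(dirs: torch.Tensor) -> torch.Tensor:
    # dirs: (N, 3) unit view directions -> (N, 4) SH features.
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return torch.stack([
        C0 * torch.ones_like(x),  # Y_0^0 (constant term)
        -C1 * y,                  # Y_1^{-1}
        C1 * z,                   # Y_1^{0}
        -C1 * x,                  # Y_1^{1}
    ], dim=-1)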

TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor' when running run_nerf on real images

Hi, HashNeRF works well on synthetic images; however, it doesn't work on nerf_llff_data scenes such as fern.
I run
python3 run_nerf.py --config configs/fern.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10
It does train, but when it is supposed to render, there's an error:
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor'

[broken image attachment]

and line 169 is
[broken image attachment]

Did anybody meet the same error? Help would be appreciated, thanks!

How does the trilinear interpolation work if the dimensions don't match?

Hey, I am stuck with this error in the function def trilinear_interp(self, x, voxel_min_vertex, voxel_max_vertex, voxel_embedds) of class HashEmbedder(nn.Module):

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 2
It occurs at the following line:

c00 = voxel_embedds[:,0]*(1-weights[:,0][:,None]) + voxel_embedds[:,4]*weights[:,0][:,None]

My input was a 3D coordinate, [X, Y, Z], and I do not see where I could reshape or change anything, because it's the voxel embedding that is the problem.

Given an input with 3 dimensions and a hash embedding with 2 dimensions, how does this even work? These are the shapes specified in the function itself:

  • x: B x 3
  • voxel_min_vertex: B x 3
  • voxel_max_vertex: B x 3
  • voxel_embedds: B x 8 x 2
  • weights = (x - voxel_min_vertex)/(voxel_max_vertex-voxel_min_vertex) # B x 3

Please help!
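
To the shape question: the interpolation runs over the 8 voxel corners, and the feature dimension (2 here) simply broadcasts along. A self-contained sketch written from the shapes in the spec above (close to, but not guaranteed identical to, the repo's trilinear_interp):

import torch

def trilinear_interp(x, voxel_min_vertex, voxel_max_vertex, voxel_embedds):
    # x, voxel_min_vertex, voxel_max_vertex: (B, 3); voxel_embedds: (B, 8, F).
    # Corner order assumed: (0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),...
    weights = (x - voxel_min_vertex) / (voxel_max_vertex - voxel_min_vertex)  # (B, 3)
    wx, wy, wz = (weights[:, i][:, None] for i in range(3))  # each (B, 1)

    # Interpolate along x: pairs of corners differing only in the x bit.
    c00 = voxel_embedds[:, 0]*(1 - wx) + voxel_embedds[:, 4]*wx  # (B, F)
    c01 = voxel_embedds[:, 1]*(1 - wx) + voxel_embedds[:, 5]*wx
    c10 = voxel_embedds[:, 2]*(1 - wx) + voxel_embedds[:, 6]*wx
    c11 = voxel_embedds[:, 3]*(1 - wx) + voxel_embedds[:, 7]*wx

    # Then along y, then along z; the feature dim F rides along untouched.
    c0 = c00*(1 - wy) + c10*wy
    c1 = c01*(1 - wy) + c11*wy
    return c0*(1 - wz) + c1*wz  # (B, F)

If the shapes match the spec, no dimension-2 mismatch can occur, so one guess is that the tensors entering the function don't have the listed shapes; printing voxel_embedds.shape and weights.shape at the failing line should localize it.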

Why not torch.optim.RAdam

Thanks for sharing this work.
I've noticed the RAdam optimizer is self-implemented; what's the difference between that and the official torch.optim.RAdam API?
Appreciated :-)

Question about hash encoding speed

Hello, thank you for your code. I have some questions regarding hash encoding. As I understand it, I only need to replace the positional encoding and large MLP in my model with a hash encoding and a small MLP to get faster training. In practice, however, in my model 20 iterations take 4 seconds with positional encoding and the large MLP, but 9 seconds with hash encoding and the small MLP.
Does this mean that hash encoding with a small MLP actually takes longer per iteration? And why does hash encoding accelerate NeRF training?

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Traceback (most recent call last):
  File "run_nerf.py", line 992, in <module>
    train()
  File "run_nerf.py", line 660, in train
    images, poses, render_poses, hwf, i_split, bounding_box = load_blender_data(args.datadir, args.half_res, args.testskip)
  File "/home/shuow/HashNeRF-pytorch/load_blender.py", line 89, in load_blender_data
    bounding_box = get_bbox3d_for_blenderobj(metas["train"], H, W, near=2.0, far=6.0)
  File "/home/shuow/HashNeRF-pytorch/utils.py", line 41, in get_bbox3d_for_blenderobj
    rays_o, rays_d = get_rays(directions, c2w)
  File "/home/shuow/HashNeRF-pytorch/ray_utils.py", line 46, in get_rays
    rays_d = directions @ c2w[:3, :3].T  # (H, W, 3)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

I tried sending directions and c2w to cuda, but still got an error. Why are two devices found?
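
A minimal sketch of the usual fix, assuming the mismatch is that directions lives on the GPU while c2w is still on the CPU (or vice versa). The helper name is hypothetical, mirroring the get_rays in the traceback:

import torch

def get_rays_same_device(directions: torch.Tensor, c2w: torch.Tensor):
    # Force both operands onto the same device before the matmul
    # that raised the error above.
    c2w = c2w.to(directions.device)
    rays_d = directions @ c2w[:3, :3].T       # (H, W, 3)
    rays_o = c2w[:3, 3].expand(rays_d.shape)  # (H, W, 3)
    return rays_o, rays_d

Note that .to() is not in-place: it returns a new tensor, so "sending c2w to cuda" only helps if the returned tensor is the one actually passed into the matmul.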

sparsity_loss = entropy

When I run this code with the synthetic Lego dataset, it works fine. But when I run it with the LLFF dataset, I encounter the following issue:

python run_nerf.py --config configs/fren.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10

[progress output], then a pdb breakpoint at:

> c:\users\nezo\desktop\3d\hashnerf-pytorch\run_nerf.py(379)raw2outputs()
-> sparsity_loss = entropy

Could you please tell me the reason for this issue?

3 questions about precrop_iters

Hi, great work, i love it.

I have 3 questions about precrop_iters, which enables a center crop during the first few iterations of training. I wonder:
1. What is the role of the crop during training?
2. Is it a trick specific to the Blender (nerf_synthetic) dataset?
3. Has this kind of trick been used in other implementations, like instant-ngp?

Thanks for your reply.

SDF experiments of InstantNGP

I am adding a tracking issue for porting over the SDF experiments from Instant-NGP, in case there is any interest or people have tried to implement it.

Training on real bigger scene

I trained this model on the classroom dataset. It has perfect input-view reconstructions but zero generalization, so moving the camera even a tiny step breaks the novel-view reconstruction. This scene is about 10x10x8 meters and the images are omnidirectional; the scene is about 4 times bigger than the synthetic scenes used here (e.g. the Lego scene). Is there something to consider when training on bigger scenes, or is this model just not capable of handling bigger scenes?

Sparsity regulariser not implemented correctly?

Here you use the weights of the ray to compute entropy. These weights, however, are un-normalized, so the entropy calculation isn't right. I think either normalizing the weights, or passing them as logits rather than probs, would be correct.
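
For reference, a sketch of the normalize-first variant suggested above (variable names are illustrative). One caveat: torch.distributions.Categorical normalizes probs along the last dimension internally, so the substantive modeling choice is really whether to append the leftover 1 - sum(weights) mass as an extra bin (as the repo does) before computing entropy:

import torch
from torch.distributions import Categorical

weights = torch.rand(4, 64)  # (B, N_samples) un-normalized ray weights (dummy data)

# Explicit normalization before the entropy computation:
probs = weights / (weights.sum(-1, keepdim=True) + 1e-6)
entropy = Categorical(probs=probs).entropy()  # (B,)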

about tv loss

New learner here. 2 questions (see the sketch after this list):

  1. What is a suitable weight for the TV loss? (The default is 1e-6, which I think is too small.)
  2. Does the official instant-ngp use a TV loss?
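
For intuition about what that weight multiplies, here is a minimal sketch of a 3D total-variation penalty on a dense feature grid. This is illustrative only; the repo applies the loss per hash-grid level, so the tensor layout there differs:

import torch

def total_variation_loss(grid: torch.Tensor) -> torch.Tensor:
    # grid: (R, R, R, F) feature grid; penalize squared differences
    # between neighboring cells along each spatial axis.
    tv_x = (grid[1:] - grid[:-1]).pow(2).mean()
    tv_y = (grid[:, 1:] - grid[:, :-1]).pow(2).mean()
    tv_z = (grid[:, :, 1:] - grid[:, :, :-1]).pow(2).mean()
    return tv_x + tv_y + tv_z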

Instant NGP

Hello, thanks for the work.
You mention in the paper that this work is orthogonal to works that leverage hybrid or explicit grid-based representations for view synthesis (like instant-ngp, I believe). Did you run any experiments combining this with their codebase? Does it make sense to go in this direction?

Another PyTorch only NGP implementation

Hey there,

I've recently come across your work on NGP via an issue at the official instant-ngp. Great work!

I am working on a very similar project. I have added acceleration structures and dynamic background training in the spirit of instant-ngp (though not exactly the same); however, my PSNR results are consistently below the values reported by NGP (on synthetic data: 27 dB vs 35 dB). Did you experience similar issues?

Here's my repo in case you want to browse it:
https://github.com/cheind/pure-torch-ngp

Accelerated raymarching & pruning

Hi, thanks for this pure PyTorch implementation of hash-grid NeRF. It helps in figuring out the details at the PyTorch level.

I noticed there is a TODO list for faster raymarching and voxel pruning. I'm wondering if there is a plan for implementing them? Or are there any suggestions for implementing them at the PyTorch level? Thanks!

About the size of the hash table

I observed that the size of the hash table at each resolution level is fixed at 2^19, but at resolution 512 the number of voxels is 512^3, which exceeds the hash table size at that level. How are so many voxels put into the hash table?
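
The short answer is spatial hashing: voxel coordinates are scattered into the table with a hash function, collisions are expected by design, and the MLP learns to resolve them. A sketch using the same primes as the repo's utils.py (the function name is illustrative):

import torch

def spatial_hash(voxel_indices: torch.Tensor, log2_hashmap_size: int) -> torch.Tensor:
    # voxel_indices: (..., 3) integer voxel coordinates. With 512^3
    # voxels and only 2^19 slots, many voxels share a slot; that is
    # intentional, not an overflow.
    primes = torch.tensor([73856093, 19349663, 83492791],
                          device=voxel_indices.device)
    p = voxel_indices * primes                 # coordinate-wise multiply
    h = p[..., 0] ^ p[..., 1] ^ p[..., 2]      # XOR-combine the axes
    return h & ((1 << log2_hashmap_size) - 1)  # mask to table size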

pretrained embedding

Is there a pretrained embedding available, so that we can obtain the embedding without retraining the parameters? This is possible with the NVIDIA instant-ngp API.

Undefined 'disps'

disps = np.stack(disps, 0)
if gt_imgs is not None and render_factor==0:
    avg_psnr = sum(psnrs)/len(psnrs)
    print("Avg PSNR over Test set: ", avg_psnr)
    with open(os.path.join(savedir, "test_psnrs_avg{:0.2f}.pkl".format(avg_psnr)), "wb") as fp:
        pickle.dump(psnrs, fp)
return rgbs, disps

The variable disps is undefined here; do you mean depths? I tried replacing disps with depths and running python run_nerf.py --config configs/chair.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10, but got an exception at the entropy calculation:

try:
    entropy = Categorical(probs = torch.cat([weights, 1.0-weights.sum(-1, keepdim=True)+1e-6], dim=-1)).entropy()
except:
    pdb.set_trace()

Any ideas on this issue? Thanks in advance.

some errors in code

File "/data/ytr/conda/envs/direct/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(input, **kwargs)
File "/data/ytr/HashNeRF-pytorch/hash_encoding.py", line 63, in forward
voxel_min_vertex, voxel_max_vertex, hashed_voxel_indices = get_voxel_vertices(
File "/data/ytr/HashNeRF-pytorch/utils.py", line 83, in get_voxel_vertices
hashed_voxel_indices = hash(voxel_indices, log2_hashmap_size)
File "/data/ytr/HashNeRF-pytorch/utils.py", line 18, in hash
return ((1<<log2_hashmap_size)-1) & (x
73856093 ^ y19349663 ^ z83492791)
TypeError: unsupported operand type(s) for &: 'int' and 'Tensor'

Can you tell me how to solve it?
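
A possible workaround, assuming the failure comes from the operand order (some older PyTorch versions reject int & Tensor while Tensor & int works): put the tensor expression on the left of the &.

# hypothetical one-line patch to utils.py's hash():
return (x*73856093 ^ y*19349663 ^ z*83492791) & ((1 << log2_hashmap_size) - 1)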

training time

So how long does training take? Thanks!

"coarse and fine network"?

Hi,

I have taken a look at your code and the original instant-ngp code, and there seems to be a difference regarding the coarse and fine networks.
I'm not sure if my understanding is correct, so I want to ask and make sure.

The coarse and fine network structures from the original NeRF do not seem to be used in instant-ngp; instead, instant-ngp uses "Occupancy Grids" (Appendix E.2) to sample points and avoid sampling at empty positions.
Your code here handles this differently from instant-ngp. Is this understanding correct?

Cheers,
Light

Training is very slow.

The README says that after training for just 5k iterations (~10 minutes on a single 1050Ti), you start seeing a crisp chair rendering. But I trained for 5,000 iterations on a Tesla M60 (6 GB), which took about an hour in total. Can you help me?
