
yashbhalgat / hashnerf-pytorch


Pure PyTorch implementation of the NVIDIA paper on instant training of neural graphics primitives: https://nvlabs.github.io/instant-ngp/

License: MIT License

Python 98.98% Shell 1.02%
nerf real-time-rendering computer-graphics computer-vision neural-network signed-distance-functions artificial-intelligence machine-learning 3d-reconstruction efficient-training

hashnerf-pytorch's Introduction

HashNeRF-pytorch

🌟 Update 🌟

Get answers to any questions about this repository using this HuggingFace Chatbot.


Instant-NGP recently introduced a Multi-resolution Hash Encoding for neural graphics primitives like NeRFs. The original NVIDIA implementation, written mainly in C++/CUDA and based on tiny-cuda-nn, can train NeRFs up to 100x faster!

This project is a pure PyTorch implementation of Instant-NGP, built to let AI researchers experiment with and build further upon this method.

This project is built on top of the super-useful NeRF-pytorch implementation.

Convergence speed w.r.t. Vanilla NeRF

HashNeRF-pytorch (left) vs NeRF-pytorch (right):

Chair.Convergence.mp4

After training for just 5k iterations (~10 minutes on a single 1050Ti), you start seeing a crisp chair rendering. :)

Instructions

Download the nerf-synthetic dataset from here: Google Drive.

To train a chair HashNeRF model:

python run_nerf.py --config configs/chair.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10

To train for other objects like ficus/hotdog, replace configs/chair.txt with configs/{object}.txt:

[Renderings: hotdog, ficus]

Extras

The code-base has additional support for the following (see the example command after this list):

  • Total Variation Loss for smoother embeddings (use --tv-loss-weight to enable)
  • Sparsity-inducing loss on the ray weights (use --sparse-loss-weight to enable)
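
For example, to train the chair scene with both regularizers enabled (the weight values below are illustrative, not tuned recommendations):

python run_nerf.py --config configs/chair.txt --tv-loss-weight 1e-6 --sparse-loss-weight 1e-10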

ScanNet dataset support

The repo now supports training a NeRF model on a scene from the ScanNet dataset. I personally found setting up the ScanNet dataset to be a bit tricky. Please find some instructions/notes in ScanNet.md.

TODO:

  • Voxel pruning during training and/or inference
  • Accelerated ray tracing, early ray termination

Citation

Kudos to Thomas Müller and the NVIDIA team for this amazing work, which will greatly help accelerate neural graphics research:

@article{mueller2022instant,
    title = {Instant Neural Graphics Primitives with a Multiresolution Hash Encoding},
    author = {Thomas M\"uller and Alex Evans and Christoph Schied and Alexander Keller},
    journal = {arXiv:2201.05989},
    year = {2022},
    month = jan
}

Also, thanks to Yen-Chen Lin for the super-useful NeRF-pytorch:

@misc{lin2020nerfpytorch,
  title={NeRF-pytorch},
  author={Yen-Chen, Lin},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/yenchenlin/nerf-pytorch/}},
  year={2020}
}

If you find this project useful, please consider citing:

@misc{bhalgat2022hashnerfpytorch,
  title={HashNeRF-pytorch},
  author={Yash Bhalgat},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/yashbhalgat/HashNeRF-pytorch/}},
  year={2022}
}

Star History

[Star History Chart]

hashnerf-pytorch's People

Contributors

ludubies, yashbhalgat


hashnerf-pytorch's Issues

Tiny-cuda-nn bindings

Hello, thank you for releasing this really useful code! I have been wondering if you and your team have run any tests with (or considered any integration of) the tiny-cuda-nn bindings already released by NVIDIA, and whether that might speed up performance.

About the SHEncoder

Hi, thanks for sharing this code.
I'm confused about the SHEncoder. Is there a reference for it?
How are the given constants C0 to C4 chosen?
Thank you :-)
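
For context on where those constants come from: C0 to C4 are the standard normalization factors of the real spherical-harmonics basis up to degree 4; the same values appear in other NeRF codebases. Below is a minimal degree-1 sketch (illustrative, not the repo's exact SHEncoder; the function name is hypothetical):

import torch

# Standard real-SH constants (these are where C0, C1, ... come from).
C0 = 0.28209479177387814   # Y_0^0  = 1 / (2*sqrt(pi))
C1 = 0.4886025119029199    # degree-1 factor = sqrt(3) / (2*sqrt(pi))

def sh_encode_deg1(dirs: torch.Tensor) -> torch.Tensor:
    # dirs: (N, 3) unit view directions -> (N, 4) SH features.
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return torch.stack([
        C0 * torch.ones_like(x),  # Y_0^0 (constant term)
        -C1 * y,                  # Y_1^{-1}
        C1 * z,                   # Y_1^{0}
        -C1 * x,                  # Y_1^{1}
    ], dim=-1)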

TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor' when running run_nerf on real images

Hi, HashNeRF works well on synthetic images; however, it doesn't work on nerf_llff_data scenes such as fern.
I run
python3 run_nerf.py --config configs/fern.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10
It does train, but when it is supposed to render, there's an error:
TypeError: unsupported operand type(s) for -: 'numpy.ndarray' and 'Tensor'

[broken image attachment]

and line 169 is
[broken image attachment]

Did anybody meet the same error? Help would be appreciated, thanks!

How does the trilinear interpolation work if the dimensions don't match?

Hey, I am stuck with this error in the function def trilinear_interp(self, x, voxel_min_vertex, voxel_max_vertex, voxel_embedds) of class HashEmbedder(nn.Module):

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 2
It occurs at the following line:

c00 = voxel_embedds[:,0]*(1-weights[:,0][:,None]) + voxel_embedds[:,4]*weights[:,0][:,None]

My input was a 3D coordinate, [X, Y, Z], and I do not see where I could reshape or change anything, because it's the voxel embedding that is the problem.

Given an input with 3 dimensions and a hash embedding with 2 dimensions, how does this even work? These are the shapes specified in the function itself:

  • x: B x 3
  • voxel_min_vertex: B x 3
  • voxel_max_vertex: B x 3
  • voxel_embedds: B x 8 x 2
  • weights = (x - voxel_min_vertex)/(voxel_max_vertex-voxel_min_vertex) # B x 3

Please help!
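
To the shape question: the interpolation runs over the 8 voxel corners, and the feature dimension (2 here) simply broadcasts along. A self-contained sketch written from the shapes in the spec above (close to, but not guaranteed identical to, the repo's trilinear_interp):

import torch

def trilinear_interp(x, voxel_min_vertex, voxel_max_vertex, voxel_embedds):
    # x, voxel_min_vertex, voxel_max_vertex: (B, 3); voxel_embedds: (B, 8, F).
    # Corner order assumed: (0,0,0),(0,0,1),(0,1,0),(0,1,1),(1,0,0),...
    weights = (x - voxel_min_vertex) / (voxel_max_vertex - voxel_min_vertex)  # (B, 3)
    wx, wy, wz = (weights[:, i][:, None] for i in range(3))  # each (B, 1)

    # Interpolate along x: pairs of corners differing only in the x bit.
    c00 = voxel_embedds[:, 0]*(1 - wx) + voxel_embedds[:, 4]*wx  # (B, F)
    c01 = voxel_embedds[:, 1]*(1 - wx) + voxel_embedds[:, 5]*wx
    c10 = voxel_embedds[:, 2]*(1 - wx) + voxel_embedds[:, 6]*wx
    c11 = voxel_embedds[:, 3]*(1 - wx) + voxel_embedds[:, 7]*wx

    # Then along y, then along z; the feature dim F rides along untouched.
    c0 = c00*(1 - wy) + c10*wy
    c1 = c01*(1 - wy) + c11*wy
    return c0*(1 - wz) + c1*wz  # (B, F)

If the shapes match the spec, no dimension-2 mismatch can occur, so one guess is that the tensors entering the function don't have the listed shapes; printing voxel_embedds.shape and weights.shape at the failing line should localize it.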

Why not torch.optim.RAdam

Thanks for sharing this work.
I've noticed the RAdam optimizer is self-implemented; what's the difference between that and the official torch.optim.RAdam API?
Appreciated :-)

Question about hash encoding speed

Hello, thank you for your code. I have some questions regarding hash encoding. As I understand it, I only need to replace the positional encoding and large MLP in my model with a hash encoding and a small MLP to get faster training. In practice, however, in my model 20 iterations take 4 seconds with positional encoding and the large MLP, but 9 seconds with hash encoding and the small MLP.
Does this mean that hash encoding with a small MLP actually takes longer per iteration? And why does hash encoding accelerate NeRF training?

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Traceback (most recent call last):
  File "run_nerf.py", line 992, in <module>
    train()
  File "run_nerf.py", line 660, in train
    images, poses, render_poses, hwf, i_split, bounding_box = load_blender_data(args.datadir, args.half_res, args.testskip)
  File "/home/shuow/HashNeRF-pytorch/load_blender.py", line 89, in load_blender_data
    bounding_box = get_bbox3d_for_blenderobj(metas["train"], H, W, near=2.0, far=6.0)
  File "/home/shuow/HashNeRF-pytorch/utils.py", line 41, in get_bbox3d_for_blenderobj
    rays_o, rays_d = get_rays(directions, c2w)
  File "/home/shuow/HashNeRF-pytorch/ray_utils.py", line 46, in get_rays
    rays_d = directions @ c2w[:3, :3].T  # (H, W, 3)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

I tried sending directions and c2w to cuda, but still got an error. Why are two devices found?
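
A minimal sketch of the usual fix, assuming the mismatch is that directions lives on the GPU while c2w is still on the CPU (or vice versa). The helper name is hypothetical, mirroring the get_rays in the traceback:

import torch

def get_rays_same_device(directions: torch.Tensor, c2w: torch.Tensor):
    # Force both operands onto the same device before the matmul
    # that raised the error above.
    c2w = c2w.to(directions.device)
    rays_d = directions @ c2w[:3, :3].T       # (H, W, 3)
    rays_o = c2w[:3, 3].expand(rays_d.shape)  # (H, W, 3)
    return rays_o, rays_d

Note that .to() is not in-place: it returns a new tensor, so "sending c2w to cuda" only helps if the returned tensor is the one actually passed into the matmul.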

sparsity_loss = entropy

When I run this code with the synthetic Lego dataset, it works fine. But when I run it with the LLFF dataset, I encounter the following issue:

python run_nerf.py --config configs/fren.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10

[progress output], then a pdb breakpoint at:

> c:\users\nezo\desktop\3d\hashnerf-pytorch\run_nerf.py(379)raw2outputs()
-> sparsity_loss = entropy

Could you please tell me the reason for this issue?

3 questions about precrop_iters

Hi, great work, i love it.

I have 3 questions about precrop_iters, which enables a center crop during the first few iterations of training. I wonder:
1. What is the role of the crop during training?
2. Is it a trick specific to the Blender (nerf_synthetic) dataset?
3. Has this kind of trick been used in other implementations, like instant-ngp?

Thanks for your reply.

SDF experiments of InstantNGP

I am adding a tracking issue for porting over the SDF experiments from Instant-NGP, in case there is any interest or people have tried to implement it.

Training on real bigger scene

I trained this model on the classroom dataset. It has perfect input-view reconstructions but zero generalization, so moving the camera even a tiny step breaks the novel-view reconstruction. This scene is about 10x10x8 meters and the images are omnidirectional; the scene is about 4 times bigger than the synthetic scenes used here (e.g. the Lego scene). Is there something to consider when training on bigger scenes, or is this model just not capable of handling bigger scenes?

Sparsity regulariser not implemented correctly?

Here you use the weights of the ray to compute entropy. These weights, however, are un-normalized, so the entropy calculation isn't right. I think either normalizing the weights, or passing them as logits rather than probs, would be correct.
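
For reference, a sketch of the normalize-first variant suggested above (variable names are illustrative). One caveat: torch.distributions.Categorical normalizes probs along the last dimension internally, so the substantive modeling choice is really whether to append the leftover 1 - sum(weights) mass as an extra bin (as the repo does) before computing entropy:

import torch
from torch.distributions import Categorical

weights = torch.rand(4, 64)  # (B, N_samples) un-normalized ray weights (dummy data)

# Explicit normalization before the entropy computation:
probs = weights / (weights.sum(-1, keepdim=True) + 1e-6)
entropy = Categorical(probs=probs).entropy()  # (B,)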

about tv loss

New learner here. 2 questions (see the sketch after this list):

  1. What is a suitable weight for the TV loss? (The default is 1e-6, which I think is too small.)
  2. Does the official instant-ngp use a TV loss?
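
For intuition about what that weight multiplies, here is a minimal sketch of a 3D total-variation penalty on a dense feature grid. This is illustrative only; the repo applies the loss per hash-grid level, so the tensor layout there differs:

import torch

def total_variation_loss(grid: torch.Tensor) -> torch.Tensor:
    # grid: (R, R, R, F) feature grid; penalize squared differences
    # between neighboring cells along each spatial axis.
    tv_x = (grid[1:] - grid[:-1]).pow(2).mean()
    tv_y = (grid[:, 1:] - grid[:, :-1]).pow(2).mean()
    tv_z = (grid[:, :, 1:] - grid[:, :, :-1]).pow(2).mean()
    return tv_x + tv_y + tv_z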

Instant NGP

Hello, thanks for the work.
You mention in the paper that this work is orthogonal to works that leverage hybrid or explicit grid-based representations for view synthesis (like instant-ngp, I believe). Did you run any experiments combining this with their codebase? Does it make sense to go in this direction?

Another PyTorch only NGP implementation

Hey there,

I've recently come across your work on NGP via an issue at the official instant-ngp. Great work!

I am working on a very similar project. I have added acceleration structures and dynamic background training in the spirit of instant-ngp (though not exactly the same); however, my PSNR results are consistently below the values reported by NGP (on synthetic data: 27 dB vs 35 dB). Did you experience similar issues?

Here's my repo in case you want to browse it:
https://github.com/cheind/pure-torch-ngp

Accelerated raymarching & pruning

Hi, thanks for this pure PyTorch implementation of hash-grid NeRF. It helps in figuring out the details at the PyTorch level.

I noticed there is a TODO list for faster raymarching and voxel pruning. I'm wondering if there is a plan for implementing them? Or are there any suggestions for implementing them at the PyTorch level? Thanks!

About the size of the hash table

I observed that the size of the hash table at each resolution level is fixed at 2^19, but at resolution 512 the number of voxels is 512^3, which exceeds the hash table size at that level. How are so many voxels put into the hash table?
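
The short answer is spatial hashing: voxel coordinates are scattered into the table with a hash function, collisions are expected by design, and the MLP learns to resolve them. A sketch using the same primes as the repo's utils.py (the function name is illustrative):

import torch

def spatial_hash(voxel_indices: torch.Tensor, log2_hashmap_size: int) -> torch.Tensor:
    # voxel_indices: (..., 3) integer voxel coordinates. With 512^3
    # voxels and only 2^19 slots, many voxels share a slot; that is
    # intentional, not an overflow.
    primes = torch.tensor([73856093, 19349663, 83492791],
                          device=voxel_indices.device)
    p = voxel_indices * primes                 # coordinate-wise multiply
    h = p[..., 0] ^ p[..., 1] ^ p[..., 2]      # XOR-combine the axes
    return h & ((1 << log2_hashmap_size) - 1)  # mask to table size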

pretrained embedding

Is there a pretrained embedding available, so that we can obtain the embedding without retraining the parameters? This is possible with the NVIDIA instant-ngp API.

Undefined 'disps'

disps = np.stack(disps, 0)
if gt_imgs is not None and render_factor==0:
    avg_psnr = sum(psnrs)/len(psnrs)
    print("Avg PSNR over Test set: ", avg_psnr)
    with open(os.path.join(savedir, "test_psnrs_avg{:0.2f}.pkl".format(avg_psnr)), "wb") as fp:
        pickle.dump(psnrs, fp)
return rgbs, disps

The variable disps is undefined here; do you mean depths? I tried replacing disps with depths and running python run_nerf.py --config configs/chair.txt --finest_res 512 --log2_hashmap_size 19 --lrate 0.01 --lrate_decay 10, but got an exception at the entropy calculation:

try:
    entropy = Categorical(probs = torch.cat([weights, 1.0-weights.sum(-1, keepdim=True)+1e-6], dim=-1)).entropy()
except:
    pdb.set_trace()

Any ideas on this issue? Thanks in advance.

some errors in code

File "/data/ytr/conda/envs/direct/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(input, **kwargs)
File "/data/ytr/HashNeRF-pytorch/hash_encoding.py", line 63, in forward
voxel_min_vertex, voxel_max_vertex, hashed_voxel_indices = get_voxel_vertices(
File "/data/ytr/HashNeRF-pytorch/utils.py", line 83, in get_voxel_vertices
hashed_voxel_indices = hash(voxel_indices, log2_hashmap_size)
File "/data/ytr/HashNeRF-pytorch/utils.py", line 18, in hash
return ((1<<log2_hashmap_size)-1) & (x
73856093 ^ y19349663 ^ z83492791)
TypeError: unsupported operand type(s) for &: 'int' and 'Tensor'

Can you tell me how to solve it?
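
A possible workaround, assuming the failure comes from the operand order (some older PyTorch versions reject int & Tensor while Tensor & int works): put the tensor expression on the left of the &.

# hypothetical one-line patch to utils.py's hash():
return (x*73856093 ^ y*19349663 ^ z*83492791) & ((1 << log2_hashmap_size) - 1)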

training time

So how long does training take? Thanks!

"coarse and fine network"?

Hi,

I have taken a look at your code and the original instant-ngp code, and there seems to be a difference regarding the coarse and fine networks.
I'm not sure if my understanding is correct, so I want to ask and make sure.

The coarse and fine network structures from the original NeRF do not seem to be used in instant-ngp; instead, instant-ngp uses "Occupancy Grids" (Appendix E.2) to sample points and avoid sampling at empty positions.
Your code here handles this differently from instant-ngp. Is this understanding correct?

Cheers,
Light

Training is very slow.

The README says that after training for just 5k iterations (~10 minutes on a single 1050Ti), you start seeing a crisp chair rendering. But I trained for 5,000 iterations on a Tesla M60 (6 GB), which took about an hour in total. Can you help me?
