haochen-rye / nerv
Official Pytorch implementation for video neural representation (NeRV)
Is there any script in this repo that applies the whole compression pipeline (pruning, quantization, entropy coding) described in the paper? I can see pruning is listed in the instructions, but most of the time the resulting model file seems even bigger than the original.
Hi, thanks for your impressive work. I'm trying to reproduce it, but I failed to convert the 7 UVG videos into PNG files and put them into one folder. If possible, could you share the command for merging multiple y4m files into one?
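One possible approach, sketched in Python: convert each y4m file with ffmpeg and offset the output numbering so that all frames land in one folder. The helper names `build_ffmpeg_commands` and `run_all`, the `f%05d.png` naming pattern, and the frame counts are assumptions for illustration, not something the repository prescribes; `-i` and `-start_number` are standard ffmpeg options.

```python
import subprocess

def build_ffmpeg_commands(videos, out_dir, pattern="f%05d.png"):
    """Build one ffmpeg command per (path, frame_count) pair.

    Each video's frames are numbered after the previous video's frames,
    so every PNG ends up in the same folder without name clashes.
    """
    commands = []
    start = 0
    for path, frame_count in videos:
        commands.append([
            "ffmpeg", "-i", path,
            "-start_number", str(start),  # first output index for this video
            f"{out_dir}/{pattern}",
        ])
        start += frame_count
    return commands

def run_all(commands):
    # Requires ffmpeg on PATH; raises CalledProcessError if a conversion fails.
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

The frame count of each video has to be supplied by the caller (for example from `ffprobe`), and the output folder must exist before `run_all` is called.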
Thank you for your work. I reviewed your article and found that your model is much larger than the actual video: the model size is 30MB, but the video is less than 1MB. Is there any method to reduce the model size? If the model is larger than the original video, the application scenarios will be greatly limited.
May I ask what type of noise you used in the video denoising experiment? I attempted to replicate the findings presented in Table 5 and Figure 10 by using pepper noise as the black noise, but the outcome was unsatisfactory. Could you please provide more information regarding the video experiment?
Hello,
I would like to thank you for sharing your work, it's a very interesting concept and I can see a lot of promising research on the subject to be done in the future.
I have a question about the decoding part: is there already a way to convert the resulting neural network back to a sequence of frames?
Hello,
I really appreciate you sharing your work.
I have a question about the decoding part. How did you calculate the BPP when encoding with a traditional codec, such as HEVC?
In the appendix, only the commands to compress videos with the H.264 or HEVC codec under the medium preset are described.
If you do not mind, please explain the details to calculate the bpp precisely.
Thanks.
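For reference, a minimal sketch of the usual definition of bits per pixel for an encoded video file: total bits in the bitstream divided by total pixels in the sequence. Whether the paper's numbers were computed exactly this way (for example, with or without container overhead) is not stated in this thread, and the function name `codec_bpp` is hypothetical.

```python
def codec_bpp(file_size_bytes, num_frames, height, width):
    """Bits per pixel of an encoded file: total bits / total pixels."""
    total_bits = file_size_bytes * 8
    total_pixels = num_frames * height * width
    return total_bits / total_pixels

# Example with made-up numbers: a 1,000,000-byte file holding
# 100 frames at 720x1280 resolution.
print(codec_bpp(1_000_000, 100, 720, 1280))
```

In practice `file_size_bytes` could come from `os.path.getsize` on the encoder's output file.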
This is impressive work. I am trying to reproduce the results in the paper. I found that two parameters control the compression ratio: quantization (bit length) and pruning ratio. How did you obtain the curves for the BDBR? Looking forward to your reply.
Hello, I have one question about reproducing the results, especially UVG.
To train NeRV for UVG dataset, I set the command as follows:
python train_nerv.py -e 150 --lower-width 96 --num-blocks 1 --dataset PATH --frame_gap 1 --outf bunny_ab --embed 1.25_80 --stem_dim_num 512_1 --reduction 2 --fc_hw_dim 9_16_112 --expansion 1 --single_res --loss Fusion6 --warmup 0.2 --lr_type cosine --strides 5 3 2 2 2 --conv_type conv -b 1 --lr 0.0005 --norm none --act gelu
Do you have any suggestions for reproducing the results accurately?
Thank you :)
Thanks for sharing your research.
I'm trying to reproduce the Figure 7 graph of your paper (PSNR vs. BPP on the UVG dataset), but I couldn't find the appropriate experiment options. Could you tell me the (C1, C2) values for that result (among the values in Appendix A.1)?
Other options I've tried so far are as follows.
I was trying to run the training script & I faced this error.
Use GPU: None for training
waiting
=> No resume checkpoint found at 'output/bunny_ab/bunny/embed1.25_40_512_1_fc_9_16_26__exp1.0_reduce2_low96_blk1_cycle1_gap1_e300_warm60_b1_conv_lr0.0005_cosine_Fusion6_Strd5,2,2,2,2_SinRes_actswish_/model_latest.pth'
Traceback (most recent call last):
File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 532, in <module>
main()
File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 141, in main
train(None, args)
File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 342, in train
loss_sum.backward()
File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/autograd/__init__.py", line 154, in backward
Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.
import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([1, 96, 360, 640], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(96, 384, kernel_size=[3, 3], padding=[1, 1], stride=[1, 1], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()
ConvolutionParams
data_type = CUDNN_DATA_FLOAT
padding = [1, 1, 0]
stride = [1, 1, 0]
dilation = [1, 1, 0]
groups = 1
deterministic = false
allow_tf32 = true
input: TensorDescriptor 0x7fa204013f00
type = CUDNN_DATA_FLOAT
nbDims = 4
dimA = 1, 96, 360, 640,
strideA = 22118400, 230400, 640, 1,
output: TensorDescriptor 0x7fa204014420
type = CUDNN_DATA_FLOAT
nbDims = 4
dimA = 1, 384, 360, 640,
strideA = 88473600, 230400, 640, 1,
weight: FilterDescriptor 0x7fa204007fa0
type = CUDNN_DATA_FLOAT
tensor_format = CUDNN_TENSOR_NCHW
nbDims = 4
dimA = 384, 96, 3, 3,
Pointer addresses:
input: 0x585de6000
output: 0x58b246000
weight: 0x50f18c000
As suggested, I ran the code snippet & the error is reproduced.
P.S:
I had to make some changes to the conda env. I am using a machine with RTX3060. Before making the changes, it gave me the following error.
NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3060 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
So, I installed the conda pytorch package from the official website. My current env looks like this
pytorch 1.10.2 py3.9_cuda11.3_cudnn8.2.0_0 pytorch
pytorch-msssim 0.2.1 pypi_0 pypi
pytorch-mutex 1.0 cuda pytorch
torchvision 0.11.3 py39_cu113 pytorch
cudatoolkit 11.3.1 h2bc3f7f_2
However, the GPU is still not being used for training (as shown in the first line of the first code snippet in this post).
P.P.S:
Might be irrelevant, but the model is loading into the GPU memory (confirmed by nvidia-smi).
Thanks for the help!
Thanks for sharing such a great work! I really like this new idea!!!
Hi,
I think I may have found a mistake in the README (or in the paper, but I believe it is just the README).
From the paper, I see that you chose to prune away 40% of the weights.
From the code, the prune_ratio parameter seems to mean the proportion of weights that are kept, since 1-prune_ratio is passed to the PyTorch prune function as the amount parameter.
In the final section of the README, however, you calculate bpp using 1 - prune_ratio. Should this not be just prune_ratio, or 1 - model_sparsity?
Furthermore, am I correct in thinking that the value of the prune_ratio parameter needed to replicate the results in the paper should be 0.6, rather than the 0.4 in the README?
Thanks for any clarifications.
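The two readings can be made concrete with a small sketch. It assumes, as the post describes, that prune_ratio is the fraction of weights kept and that 1 - prune_ratio is what reaches the `amount` argument of PyTorch's pruning utilities (where a float `amount` is the fraction of weights to remove). The function names are hypothetical and this has not been checked against the repository.

```python
def amount_from_prune_ratio(prune_ratio):
    """Fraction of weights removed when prune_ratio is the fraction KEPT
    (the reading described in the post above; an assumption here)."""
    return 1.0 - prune_ratio

def remaining_params(num_params, prune_ratio):
    """Number of weights left after pruning, under the same reading."""
    return num_params * prune_ratio

# Pruning away 40% of the weights, as in the paper, would then correspond to:
prune_ratio = 0.6
print(amount_from_prune_ratio(prune_ratio))      # fraction removed
print(remaining_params(1_000_000, prune_ratio))  # hypothetical 1M-parameter model
```

Under this reading, the factor that belongs in a bpp formula is the kept fraction, that is prune_ratio itself or equivalently 1 - sparsity.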
hi,
I'm interested in interpolating between frames. In Appendix A.4 of the paper there is a figure where you interpolate between two seen frames. I tried to reproduce that with your pretrained model, but there are strong artifacts in the result.
The only thing I changed was embed_input = pe(norm_idx+(1/132)*0.5)
in train_nerv.py, line 484. Is there something I missed for reproducing the interpolation? I would be glad for any hint.
It's great work, and I saw there is a follow-up paper in review that seems to focus on that problem too, right? Are there any plans for an approximate release date?
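To make the change above concrete, here is a minimal pure-Python sketch of a sinusoidal positional encoding of the form the paper describes, with components sin(b^i * pi * t) and cos(b^i * pi * t), evaluated at a time index halfway between two frames. The defaults `base=1.25` and `levels=80` mirror the `--embed 1.25_80` option seen elsewhere on this page; the function names are hypothetical, and the repository's own `pe` may differ in detail.

```python
import math

def positional_encoding(t, base=1.25, levels=80):
    """Return [sin(b^0*pi*t), cos(b^0*pi*t), ..., sin(b^(L-1)*pi*t), cos(b^(L-1)*pi*t)]."""
    out = []
    for i in range(levels):
        angle = (base ** i) * math.pi * t
        out.extend([math.sin(angle), math.cos(angle)])
    return out

def midpoint_time(frame_idx, num_frames):
    """Normalized time halfway between frame_idx and frame_idx + 1."""
    return (frame_idx + 0.5) / num_frames

# Embedding for the point halfway between frames 10 and 11 of a 132-frame video.
embedding = positional_encoding(midpoint_time(10, 132))
print(len(embedding))
```

With these settings the highest frequencies are very large, so even a half-frame shift in t changes the high-frequency components of the embedding substantially. That may be one reason unseen time indices produce artifacts, though this sketch does not establish it.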
The output of the training run in the shell is as follows.
Use GPU: 0 for training
Use GPU: 1 for training
...
Traceback (most recent call last):
File "train_nerv.py", line 532, in <module>
main()
File "train_nerv.py", line 139, in main
mp.spawn(train, nprocs=args.ngpus_per_node, args=(args,))
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
while not context.join():
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:
-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
fn(i, *args)
File "/mnt/data/isp001/workspaces/nerv/code/train_nerv.py", line 335, in train
output_list = model(embed_input)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/data/isp001/workspaces/nerv/code/model_nerv.py", line 175, in forward
output = self.stem(input)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
return F.linear(input, self.weight, self.bias)
File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`
The images produced by the network have a different (lower) resolution than the original. Which argument controls this?
PS: Also, the output images have a much worse PSNR than during training. Train ~28; Test ~10
Hello, I really appreciate your impressive work.
I have one question about calculating bits per pixel.
You mentioned that bpp is calculated as follows: Model_Parameter∗(1−Prune_Ratio)∗Quant_Bit/Pixel_Num
What does Pixel_Num mean here?
For example, what would it be for a video with 100 frames at 720x1280 resolution?
Thanks.
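A minimal sketch of the formula quoted above, assuming that Pixel_Num is the total number of pixels in the video (frames x height x width), which is the usual convention for bpp but is not confirmed in this thread. To sidestep the ambiguity around Prune_Ratio, the sketch takes the sparsity (fraction of weights removed) directly. The function name and the 3M-parameter, 8-bit example are hypothetical.

```python
def model_bpp(num_params, sparsity, quant_bits, num_frames, height, width):
    """bpp = kept parameters * bits per parameter / total pixels."""
    kept_params = num_params * (1.0 - sparsity)
    pixel_num = num_frames * height * width  # assumed meaning of Pixel_Num
    return kept_params * quant_bits / pixel_num

# Hypothetical model: 3M parameters, 40% of weights pruned, 8-bit quantization,
# representing 100 frames at 720x1280 resolution.
print(model_bpp(3_000_000, 0.4, 8, 100, 720, 1280))
```

Under this assumption, Pixel_Num for the example video would be 100 * 720 * 1280 = 92,160,000.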
Hello, thanks again for sharing the results of your research!
I would like to make use of the results listed in psnr_bpp_results.csv for comparison, but I can't figure out which video relates to each line in the CSV file. The UVG dataset itself is bigger than the amount of results (7) listed in the first half.
Could you please add the video names in the first column? Thanks.
Line 442 in adf61b8
When the network's weights are compressed with Huffman codes, I confirmed that zero values are excluded.
In this case, we cannot know the positions of the pruned weights when reconstructing the model weights.
I think additional information (such as the indices of the pruned weights) is required.
Can you explain the details?
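The concern can be illustrated with a small sketch: if only the nonzero weights are entropy coded, the decoder also needs some record of where the zeros were, for instance a binary mask. This is a generic illustration with hypothetical function names, not a description of what the repository does.

```python
def pack_sparse(weights):
    """Split a flat weight list into a keep-mask and the nonzero values."""
    mask = [w != 0 for w in weights]
    values = [w for w in weights if w != 0]
    return mask, values

def unpack_sparse(mask, values):
    """Rebuild the flat weight list from the mask and the nonzero values."""
    it = iter(values)
    return [next(it) if keep else 0 for keep in mask]

weights = [3, 0, 0, -2, 5, 0]
mask, values = pack_sparse(weights)
assert unpack_sparse(mask, values) == weights
```

An uncompressed mask costs one bit per weight, so counting only the Huffman-coded nonzero values would understate the size needed for exact reconstruction.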
Thank you for sharing the brilliant work!