haochen-rye / nerv Goto Github PK

Official Pytorch implementation for video neural representation (NeRV)

Python 100.00%

nerv's Issues

Full compression pipeline

Is there any script in this repo that applies the whole compression pipeline (pruning, quantization, entropy coding) described in the paper? Pruning is listed in the instructions, but most of the time the resulting model file is even bigger than the original.

Some problems when processing with UVG dataset

Hi, thanks for your impressive work. I'm trying to reproduce it, but I failed to convert the 7 UVG videos into PNG files and put them into one folder. If possible, can you share the command for merging multiple y4m files into one sequence?
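
One possible way to do the merge, sketched under assumptions rather than taken from the repository: each y4m file is first dumped to its own folder of PNG frames (for example with something like `ffmpeg -i video.y4m video_dir/f%05d.png`), and the folders are then copied into a single folder with consecutive numbering. The function name `merge_frame_folders` and the `f00001.png` naming are hypothetical.

```python
import shutil
from pathlib import Path


def merge_frame_folders(video_dirs, out_dir, pattern="*.png"):
    """Copy frames from several per-video folders into one folder,
    renumbering them consecutively (f00001.png, f00002.png, ...)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    index = 0
    # The order of video_dirs decides the order of the merged sequence.
    for video_dir in video_dirs:
        for frame in sorted(Path(video_dir).glob(pattern)):
            index += 1
            shutil.copy(frame, out_dir / f"f{index:05d}.png")
    return index  # total number of frames written
```

Calling it with the seven per-video folders in a fixed order and one output folder gives a single directory that can be passed as the dataset path.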

Why the model size is much bigger than the origin video

Thank you for your work. I tried to reproduce your paper and found that the model is much larger than the actual video: the model is 30 MB, while the video is less than 1 MB. Is there any method to reduce the model size? If the model is larger than the original video, the application scenarios will be greatly limited.

some problems with video denoising

May I ask what type of noise you used in the video denoising experiment? I attempted to replicate the findings presented in Table 5 and Figure 10 by using pepper noise as the black noise, but the outcome was unsatisfactory. Could you please provide more information regarding the video experiment?

Regarding the UVG replication issue.

How do I calculate the bits per pixel (bpp) after training is completed, when I've merged and cropped seven UVG videos into 3900 frames, and there's only one result after training?
Regarding the calculation of bits per pixel (bpp), aren't the model parameters the ones used during training?

Why does the model file get bigger with pruning?
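
A possible explanation, offered as an assumption and not as a diagnosis of this repository: pruning only sets weights to zero, and a dense checkpoint still stores every zero at full width. In addition, `torch.nn.utils.prune` keeps both the original weights (`weight_orig`) and a mask (`weight_mask`) on the module until `prune.remove` is called, so a checkpoint saved before that point holds both tensors. The pure-Python sketch below uses toy random weights to illustrate the first point: the raw size does not change after pruning, and only a compression step benefits from the zeros. Here `zlib` merely stands in for an entropy coder.

```python
import random
import struct
import zlib

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Magnitude pruning: zero the 40% of weights with the smallest absolute value.
threshold = sorted(abs(w) for w in weights)[int(0.4 * len(weights))]
pruned = [0.0 if abs(w) < threshold else w for w in weights]


def as_float32_bytes(values):
    """Serialize a list of floats as densely packed float32 values."""
    return struct.pack(f"{len(values)}f", *values)


dense_raw = as_float32_bytes(weights)
pruned_raw = as_float32_bytes(pruned)

# Identical raw sizes: each zero still occupies 4 bytes.
print(len(dense_raw), len(pruned_raw))
# Only after compression does the pruned version become smaller.
print(len(zlib.compress(dense_raw)), len(zlib.compress(pruned_raw)))
```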

Lack of instructions for decoding

Hello,

I would like to thank you for sharing your work, it's a very interesting concept and I can see a lot of promising research on the subject to be done in the future.

I have a question about the decoding part: is there already a way to convert the resulting neural network back to a sequence of frames?

Bpp calculation

Hello,

I really appreciate you sharing your work.

I have a question about the decoding part. How did you calculate the bpp when encoding with a traditional codec such as HEVC?

In the appendix, only the commands used to compress videos with the H.264 or HEVC codec under the medium preset are described.

If you do not mind, please explain in detail how the bpp is calculated.

Thanks.
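
For reference, a common convention for a codec's bpp is the size of the encoded file in bits divided by the total number of pixels in the video. The sketch below follows that convention; whether the paper counts container overhead or uses exactly this formula is not stated here, so treat it as an assumption. The function name `codec_bpp` is hypothetical.

```python
import os


def codec_bpp(bitstream_path, num_frames, height, width):
    """Bits per pixel of an encoded video file: total bits / total pixels."""
    total_bits = os.path.getsize(bitstream_path) * 8
    return total_bits / (num_frames * height * width)
```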

Some questions about the experimental details.

This is impressive work. I am trying to reproduce the results in the paper. I found that two parameters control the compression ratio: quantization (bit length) and pruning ratio. How did you obtain the curves for the BDBR? Looking forward to your reply.

UVG dataset reproduce

Hello, I have one question about reproducing the results, especially UVG.

To train NeRV for UVG dataset, I set the command as follows:

python train_nerv.py -e 150 --lower-width 96 --num-blocks 1 --dataset PATH --frame_gap 1 --outf bunny_ab --embed 1.25_80 --stem_dim_num 512_1 --reduction 2 --fc_hw_dim 9_16_112 --expansion 1 --single_res --loss Fusion6 --warmup 0.2 --lr_type cosine --strides 5 3 2 2 2 --conv_type conv -b 1 --lr 0.0005 --norm none --act gelu

Do you have any suggestions for reproducing the results accurately?

Thank you :)

UVG dataset experiment options

Thanks for sharing your research.

I'm trying to reproduce the Figure 7 graph of your paper (PSNR vs. BPP on the UVG dataset), but I couldn’t find the appropriate experiment options. Could you tell me which (C1, C2) values, among those in Appendix A.1, were used for that result?

Other options I've tried so far are as follows.

Learning rate:
op1. 5e-4 (paper 4.1)
op2. 5e-4 x 6 (linear scaling rule with batch size 6)
Up-scale factor: 5,3,2,2,2 (paper 4.1)
Train epochs: 1500 epochs (paper 4.1)
Warmup epochs:
op1. 300 epochs (train code's default. train epochs * 0.2)
op2. 30 epochs (paper 4.1)
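
To see how much the two warmup options differ, here is a generic linear-warmup plus cosine-decay schedule. It is a sketch under assumptions: the repository's exact schedule may differ (for example in the starting learning rate), and `lr_at` is a hypothetical helper.

```python
import math


def lr_at(epoch, base_lr=5e-4, total_epochs=1500, warmup_epochs=300):
    """Linear warmup from 0 to base_lr, then cosine decay to 0."""
    if epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at epoch 30 under each warmup option.
print(lr_at(30, warmup_epochs=300))  # still early in warmup
print(lr_at(30, warmup_epochs=30))   # warmup just finished
```

Under this schedule the learning rate at epoch 30 is ten times smaller with the 300-epoch warmup than with the 30-epoch one.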

RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR

I was trying to run the training script & I faced this error.


Use GPU: None for training
waiting
=> No resume checkpoint found at 'output/bunny_ab/bunny/embed1.25_40_512_1_fc_9_16_26__exp1.0_reduce2_low96_blk1_cycle1_gap1_e300_warm60_b1_conv_lr0.0005_cosine_Fusion6_Strd5,2,2,2,2_SinRes_actswish_/model_latest.pth'
Traceback (most recent call last):
  File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 532, in <module>
    main()
  File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 141, in main
    train(None, args)
  File "/home/sparsh/event_fit/NeRV/train_nerv.py", line 342, in train
    loss_sum.backward()
  File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/sparsh/anaconda3/envs/nerv/lib/python3.9/site-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = False
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([1, 96, 360, 640], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(96, 384, kernel_size=[3, 3], padding=[1, 1], stride=[1, 1], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams 
    data_type = CUDNN_DATA_FLOAT
    padding = [1, 1, 0]
    stride = [1, 1, 0]
    dilation = [1, 1, 0]
    groups = 1
    deterministic = false
    allow_tf32 = true
input: TensorDescriptor 0x7fa204013f00
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 1, 96, 360, 640, 
    strideA = 22118400, 230400, 640, 1, 
output: TensorDescriptor 0x7fa204014420
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 1, 384, 360, 640, 
    strideA = 88473600, 230400, 640, 1, 
weight: FilterDescriptor 0x7fa204007fa0
    type = CUDNN_DATA_FLOAT
    tensor_format = CUDNN_TENSOR_NCHW
    nbDims = 4
    dimA = 384, 96, 3, 3, 
Pointer addresses: 
    input: 0x585de6000
    output: 0x58b246000
    weight: 0x50f18c000

As suggested, I ran the code snippet & the error is reproduced.

P.S:
I had to make some changes to the conda env. I am using a machine with RTX3060. Before making the changes, it gave me the following error.

NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3060 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

So, I installed the conda pytorch package from the official website. My current env looks like this

pytorch                   1.10.2          py3.9_cuda11.3_cudnn8.2.0_0    pytorch
pytorch-msssim            0.2.1                    pypi_0    pypi
pytorch-mutex             1.0                        cuda    pytorch
torchvision               0.11.3               py39_cu113    pytorch
cudatoolkit               11.3.1               h2bc3f7f_2

However, the GPU is still not being used for training (as shown in the first line of the first code snippet in this post).

P.P.S:
Might be irrelevant, but the model is loading into the GPU memory (confirmed by nvidia-smi).

Thanks for the help!

Hyperparameter for UVG Dataset

It's really an interesting and surprising work!

Thanks for sharing such a great work! I really like this new idea!!!

Possible mistake in ReadMe

Hi,

I think I may have found a mistake in the README (or in the paper, but I believe it is just the README).

From the paper, I see that you choose to prune away 40% of the weights.

From the code, the prune_ratio parameter seems to mean the proportion of weights that are kept, as 1-prune_ratio is passed to the PyTorch prune function as the amount parameter.

In the final section of the README, however, you calculate bpp using 1 - prune_ratio. Should this not be just prune_ratio, or 1-model_sparsity?

Furthermore, am I correct in thinking that the value of the prune_ratio parameter needed to replicate the results in the paper should be 0.6, unlike the 0.4 in the README?

Thanks for any clarifications.
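
The reasoning in this issue can be checked with a toy example. In `torch.nn.utils.prune`, `amount` is the fraction of entries to prune, so if the code passes `1 - prune_ratio` as described above, the kept fraction equals `prune_ratio`. The sketch below imitates magnitude pruning in pure Python; it is an illustration with made-up weights, not the repository's code, and `magnitude_prune` and `sparsity` are hypothetical helpers.

```python
def magnitude_prune(weights, amount):
    """Zero out the `amount` fraction of weights with the smallest magnitude."""
    n_prune = round(amount * len(weights))
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


weights = [0.9, -0.1, 0.4, -0.7, 0.05, 0.3, -0.2, 0.8, 0.6, -0.5]
prune_ratio = 0.6  # read here as "fraction kept", following the issue
pruned = magnitude_prune(weights, amount=1 - prune_ratio)
kept_fraction = 1 - sparsity(pruned)
print(sparsity(pruned), kept_fraction)
```

With this reading, the factor that scales the model size in a bpp estimate is the kept fraction, that is prune_ratio or equivalently 1 - sparsity.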

Interpolating between two frames

Hi,

I'm interested in interpolating between frames. In Appendix A.4 of the paper there is a figure where you interpolate between two seen frames. I tried to reproduce that with your pretrained model, but there are strong artifacts in the result.

The only thing I changed was embed_input = pe(norm_idx+(1/132)*0.5) in train_nerv.py, line 484. Is there something I missed for reproducing the interpolation? I would be glad for any hint.

It's great work, and I saw there is a follow-up paper in review that seems to focus on that problem too, right? Are there any plans for an approximate release date?
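
To make the change above concrete, here is a sketch of the positional encoding from the paper evaluated at a seen frame and at the midpoint to the next frame. It assumes that an embed setting such as 1.25_40 means base 1.25 with 40 frequency levels and that norm_idx is the frame index divided by the number of frames; both are assumptions, and the frame index `k` is hypothetical.

```python
import math


def positional_encoding(t, base=1.25, levels=40):
    """Embed a normalized time index t as
    (sin(base^i * pi * t), cos(base^i * pi * t)) for i = 0..levels-1."""
    out = []
    for i in range(levels):
        angle = (base ** i) * math.pi * t
        out.extend((math.sin(angle), math.cos(angle)))
    return out


num_frames = 132  # frame count implied by the 1/132 step in the issue
k = 10            # hypothetical frame index
emb_seen = positional_encoding(k / num_frames)
emb_mid = positional_encoding((k + 0.5) / num_frames)

# Phase change of the highest-frequency term over half a frame step.
top_shift = (1.25 ** 39) * math.pi * (0.5 / num_frames)
print(len(emb_seen), top_shift)
```

Under these assumptions the highest-frequency terms move by more than a full period over half a frame step, so the midpoint embedding is not a simple blend of its two neighbours. Whether that accounts for the artifacts reported here is not established.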

RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

The training output on the shell is shown below.

Use GPU: 0 for training
Use GPU: 1 for training
...
Traceback (most recent call last):
  File "train_nerv.py", line 532, in <module>
    main()
  File "train_nerv.py", line 139, in main
    mp.spawn(train, nprocs=args.ngpus_per_node, args=(args,))
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/mnt/data/isp001/workspaces/nerv/code/train_nerv.py", line 335, in train
    output_list = model(embed_input)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/data/isp001/workspaces/nerv/code/model_nerv.py", line 175, in forward
    output = self.stem(input)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/mnt/data/isp001/anaconda3/envs/nerv/lib/python3.8/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_INTERNAL_ERROR when calling `cublasCreate(handle)`

Dumped images with different resolution.

The images produced by the network have a different (lower) resolution than the original. Which argument controls this?

PS: Also, the output images have a much worse PSNR than during training. Train ~28; Test ~10
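
A likely place to look, stated as an assumption about how the arguments interact: the output size appears to be the initial feature-map size from --fc_hw_dim multiplied by the product of --strides. The sketch below checks this against two settings that appear on this page; `output_resolution` is a hypothetical helper.

```python
from math import prod


def output_resolution(fc_hw_dim, strides):
    """Return (height, width) after upsampling the initial feature map
    by every stride in turn. fc_hw_dim looks like '9_16_112' (h_w_channels)."""
    h, w, _channels = (int(x) for x in fc_hw_dim.split("_"))
    scale = prod(strides)
    return h * scale, w * scale


print(output_resolution("9_16_112", [5, 3, 2, 2, 2]))  # (1080, 1920)
print(output_resolution("9_16_26", [5, 2, 2, 2, 2]))   # (720, 1280)
```

The second setting is consistent with the 360x640 tensor in the cuDNN error log above, which is a 9x16 map after the first four strides (5, 2, 2, 2). If the dumped images are smaller than the source frames, these two arguments may not match the video resolution.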

Distortion-Compression result

Hello, I really appreciate your impressive work.

I have one question about calculating bits per pixel.

You mentioned that bpp is calculated as follows: Model_Parameter∗(1−Prune_Ratio)∗Quant_Bit/Pixel_Num

Here, what does Pixel_Num mean?
For example, suppose we have a video with 100 frames at 720x1280 resolution. Is Pixel_Num:

number of frames * width * height (100x720x1280), or
width * height (720x1280)?

Thanks.
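
A sketch of the formula quoted above, assuming Pixel_Num counts the pixels of the whole video (frames * height * width). That reading is the usual one for video bpp, since one model encodes all frames, but it is not confirmed on this page. The parameter count and pruning level below are hypothetical numbers, and `nerv_bpp` is a hypothetical helper.

```python
def nerv_bpp(num_params, kept_fraction, quant_bits, num_frames, height, width):
    """Bits per pixel as num_params * kept_fraction * quant_bits / pixel_num,
    with pixel_num taken over the whole video (assumption)."""
    model_bits = num_params * kept_fraction * quant_bits
    pixel_num = num_frames * height * width
    return model_bits / pixel_num


# Hypothetical model: 3M parameters, 60% of weights kept, 8-bit quantization,
# applied to the 100-frame 720x1280 video from the question.
bpp = nerv_bpp(3_000_000, 0.6, 8, 100, 720, 1280)
print(bpp)
```

Dividing by width * height alone would instead give a value 100 times larger for this example, which would be bits per pixel position rather than bits per video pixel.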

Reproducing results

Missing UVG video identifiers

Hello, thanks again for sharing the results of your research!

I would like to use the results listed in psnr_bpp_results.csv for comparison, but I can't figure out which video each line in the CSV file refers to. The UVG dataset itself contains more videos than the 7 results listed in the first half.

Could you please add the video names in the first column? Thanks.

About weight pruning and entropy coding

NeRV/train_nerv.py

Line 442 in adf61b8

valid_quant_v = quant_v[v!=0] # only include non-zero weights

When the network's weights are compressed with Huffman codes, I confirmed that zero values are excluded.
In this case, we cannot know the positions of the pruned weights when reconstructing the model weights.
I think additional information (such as the indices of the pruned weights) is required.

Can you explain the details?

Thank you for sharing the brilliant work!
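
To illustrate the concern, the sketch below compares two generic ways of making the pruned positions recoverable: keeping zero as a symbol in the Huffman alphabet, or coding only the non-zero values and adding a one-bit-per-weight mask. It uses toy quantized values and is not a description of what the repository does; the bpp reported there may simply be an estimate that leaves out the position cost.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built on `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single symbol still needs one bit
        return {next(iter(freq)): 1}
    # Heap entries: (count, tie_breaker, {symbol: depth so far}).
    heap = [(c, i, {s: 0}) for i, (s, c) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]


def total_bits(symbols):
    """Total coded length of `symbols` under their own Huffman code."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


quant = [0, 3, 0, -2, 3, 0, 1, 3, 0, -2]  # toy quantized weights; 0 = pruned

# Option A: keep zeros in the alphabet, so positions are implicit.
bits_with_zeros = total_bits(quant)
# Option B: code non-zeros only and add a 1-bit-per-weight position mask.
nonzero = [q for q in quant if q != 0]
bits_with_mask = total_bits(nonzero) + len(quant)
print(bits_with_zeros, bits_with_mask)
```

Either option costs more than coding the non-zero values alone, which is the gap this issue points at. Neither count includes the cost of storing the code table itself.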
