HNeRV: A Hybrid Neural Representation for Videos (CVPR 2023)

Project Page | UVG Data

Hao Chen, Matthew Gwilliam, Ser-Nam Lim, Abhinav Shrivastava
This is the official implementation of the paper "HNeRV: A Hybrid Neural Representation for Videos".

TODO

[ ✓ ] Video inpainting
[ ✓ ] Fast loading from video checkpoints
Upload results and checkpoints for UVG

Method overview

Get started

We run with Python 3.8, you can set up a conda environment with all dependencies like so:

pip install -r requirements.txt

High-Level structure

The code is organized as follows:

train_nerv_all.py includes a generic traiing routine.
model_all.py contains the dataloader and neural network architecure
data/ directory video/imae dataset, we provide bunny frames here
checkpoints/ we provide model weights, and quantized video checkpoints for bunny here
log files (tensorboard, txt, state_dict etc.) will be saved in output directory (specified by --outf)
We provide numerical results for distortion-compression at uvg_results and per_video_results .

Reproducing experiments

Training HNeRV

HNeRV of 1.5M is specified with '--modelsize 1.5', and we balance parameters with '-ks 0_1_5 --reduce 1.2'

python train_nerv_all.py --gpus 1 --outf 1120  --data_path data/bunny --vid bunny   \
   --conv_type convnext pshuffel --act gelu --norm none  --crop_list 640_1280  \
    --resize_list -1 --loss L2  --enc_strds 5 4 4 2 2 --enc_dim 64_16 \
    --dec_strds 5 4 4 2 2 --ks 0_1_5 --reduce 1.2   \
    --modelsize 1.5  -e 300 --eval_freq 30  --lower_width 12 -b 2 --lr 0.001

NeRV baseline

NeRV baseline is specified with '--embed pe_1.25_80 --fc_hw 8_16', with imbalanced parameters '--ks 0_3_3 --reduce 2'

python train_nerv_all.py  --outf 1120  --data_path data/bunny --vid bunny   \
   --conv_type convnext pshuffel --act gelu --norm none  --crop_list 640_1280  \
   --resize_list -1 --loss L2   --embed pe_1.25_80 --fc_hw 8_16 \
    --dec_strds 5 4 2 2 --ks 0_3_3 --reduce 2   \
    --modelsize 1.5  -e 300 --eval_freq 30  --lower_width 12 -b 2 --lr 0.001

Evaluation & dump images and videos

To evaluate pre-trained model, use '--eval_only --weight [CKT_PATH]' to evaluate and specify model path.
For model and embedding quantization, use '--quant_model_bit 8 --quant_embed_bit 6'.
To dump images or videos, use '--dump_images --dump_videos'.

python train_nerv_all.py  --gpus 1 --outf 1120  --data_path data/bunny --vid bunny   \
   --conv_type convnext pshuffel --act gelu --norm none  --crop_list 640_1280  \
    --resize_list -1 --loss L2  --enc_strds 5 4 4 2 2 --enc_dim 64_16 \
    --dec_strds 5 4 4 2 2 --ks 0_1_5 --reduce 1.2  \
    --modelsize 1.5  -e 300 --eval_freq 30  --lower_width 12 -b 2 --lr 0.001 \
   --eval_only --weight checkpoints/hnerv-1.5m-e300.pth \
   --quant_model_bit 8 --quant_embed_bit 6 \
    --dump_images --dump_videos

Video inpainting

We can specified inpainting task with '--vid bunny_inpaint_50' where '50' is the mask size.

python train_nerv_all.py  --outf 1120  --data_path data/bunny --vid bunny_inpaint_50   \
   --conv_type convnext pshuffel --act gelu --norm none  --crop_list 640_1280  \
    --resize_list -1 --loss L2  --enc_strds 5 4 4 2 2 --enc_dim 64_16 \
    --dec_strds 5 4 4 2 2 --ks 0_1_5 --reduce 1.2   \
    --modelsize 1.5  -e 300 --eval_freq 30  --lower_width 12 -b 2 --lr 0.001

Efficient video loading

We can load video efficiently from a tiny checkpoint.
Specify decoder and checkpoint by '--decoder [Decoder_path] --ckt [Video checkpoint]', output dir and frames by '--dump_dir [out_dir] --frames [frame_num]'.

python efficient_nvloader.py --frames 16

Citation

If you find our work useful in your research, please cite:

@InProceedings{chen2023hnerv,
      title={{HN}e{RV}: Neural Representations for Videos}, 
      author={Hao Chen and Matthew Gwilliam and Ser-Nam Lim and Abhinav Shrivastava},
      year={2023},
      booktitle={CVPR},
}

Contact

If you have any questions, please feel free to email the authors: [email protected]

fork-for-modify / hnerv Goto Github PK

hnerv's Introduction

HNeRV: A Hybrid Neural Representation for Videos (CVPR 2023)

Project Page | UVG Data

TODO

Method overview

Get started

High-Level structure

Reproducing experiments

Training HNeRV

NeRV baseline

Evaluation & dump images and videos

Video inpainting

Efficient video loading

Citation

Contact

hnerv's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent