isvi's Introduction

[CVPR 2022] Inertia-Guided Flow Completion and Style Fusion for Video Inpainting

[Paper] / [Demo] / [Project page](https://hitachinsk.github.io/publication/2022-06-01-Inertia-Guided-Flow-Completion-and-Style-Fusion-for-Video-Inpainting) / [Poster] / [Intro]

This repository contains the implementation of the following paper:

Inertia-Guided Flow Completion and Style Fusion for Video Inpainting
Kaidong Zhang, Jingjing Fu and Dong Liu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

Overview

Physical objects have inertia, which resists changes in velocity and motion direction. Inspired by this, we introduce an inertia prior: the optical flow, which reflects object motion within a local temporal window, stays roughly unchanged in the adjacent preceding and subsequent frames. Based on this prior, we propose a flow completion network that aligns and aggregates flow features from consecutive flow sequences. The corrupted flows are completed under the supervision of customized losses on reconstruction, flow smoothness, and consistent ternary census transform. The completed flows, with their high fidelity, significantly improve video inpainting quality.

Nevertheless, existing flow-guided cross-frame warping methods fail to account for lighting and sharpness variation across video frames, which leads to spatial incoherence after warping from other frames. To alleviate this problem, we propose the Adaptive Style Fusion Network (ASFN), which utilizes style information extracted from the valid regions to guide gradient refinement in the warped regions. Moreover, we design a data simulation pipeline to reduce the training difficulty of ASFN. Extensive experiments show the superiority of our method over state-of-the-art methods, both quantitatively and qualitatively.
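
As a loose illustration of the inertia prior (a minimal NumPy sketch with names of our own choosing, not the authors' network), the flow of a corrupted region can be initialized from the nearest adjacent frame in which that region is valid:

import numpy as np

def inertia_fill(flows, masks):
    """Fill corrupted flow pixels from the nearest temporally adjacent frame.

    flows: (T, H, W, 2) array of per-frame optical flow.
    masks: (T, H, W) boolean array, True where the flow is corrupted.
    """
    filled = flows.copy()
    T = len(flows)
    for t in range(T):
        hole = masks[t].copy()
        # Visit neighbors in order of temporal distance: by the inertia prior,
        # flow changes little between adjacent frames, so nearer is better.
        for dt in (-1, 1, -2, 2):
            s = t + dt
            if 0 <= s < T and hole.any():
                usable = hole & ~masks[s]
                filled[t][usable] = flows[s][usable]
                hole &= ~usable
    return filled

The actual flow completion network aligns and aggregates learned flow features from these temporal neighbors rather than copying raw values, but the temporal-neighbor assumption is the same.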

Prerequisites

  • Linux (we tested our code on Ubuntu 18.04)
  • Anaconda
  • Python 3.7.6
  • PyTorch 1.6.0

To get started, first clone the repo:

git clone https://github.com/hitachinsk/ISVI.git

Then, please run the following commands:

conda create -n ISVI python=3.7
conda activate ISVI
pip install -r requirements.txt
bash install_dependances.sh
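
If the setup succeeded, the following optional check (our suggestion, not part of the repo's instructions) should report PyTorch 1.6.0 and an available GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"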

Quick start

  1. Download the pre-trained models and the data.
  2. Put the downloaded zip files into the root directory of this project.
  3. Run bash prepare_data.sh to unzip the files.
  4. Run the object removal demo:
cd tool
python video_inpainting.py --path xxx \
--path_mask xxx \
--outroot xxx

If everything works, you will find a result.mp4 file in the directory passed to --outroot, and the video should look like the demo.
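
For example, a hypothetical invocation (the paths below are placeholders; point them at wherever prepare_data.sh unpacked the demo frames and masks):

cd tool
python video_inpainting.py --path ../demo/frames \
--path_mask ../demo/masks \
--outroot ../demo/results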

License

This work is licensed under the MIT license. See LICENSE for details.

Citation

If our work inspires your research or parts of the code are useful for your work, please cite our paper:

@InProceedings{Zhang_2022_CVPR,
    author    = {Zhang, Kaidong and Fu, Jingjing and Liu, Dong},
    title     = {Inertia-Guided Flow Completion and Style Fusion for Video Inpainting},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {5982-5991}
}

You may also be interested in our other video inpainting papers, FGT and FGT++ (the journal extension of FGT):

@inproceedings{zhang2022flow,
  title={Flow-Guided Transformer for Video Inpainting},
  author={Zhang, Kaidong and Fu, Jingjing and Liu, Dong},
  booktitle={European Conference on Computer Vision},
  pages={74--90},
  year={2022},
  organization={Springer}
}
@misc{zhang2023exploiting,
  title = {Exploiting Optical Flow Guidance for Transformer-Based Video Inpainting},
  author = {Zhang, Kaidong and Peng, Jialun and Fu, Jingjing and Liu, Dong},
  year = {2023},
  eprint = {2301.10048},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV},
  doi = {10.48550/arXiv.2301.10048},
  url = {https://arxiv.org/abs/2301.10048}
}

Contact

If you have any questions, please feel free to contact us.

Acknowledgement

Some parts of this repo are based on FGVC and the flow forward-warp package, and we adopt RAFT for flow estimation.

isvi's Issues

About training

Hello, I wonder how to train and evaluate your model. Looking forward to your reply.

Were moving square masks used in qualitative results?

I saw that moving square masks are used in your demo video and qualitative results.
Did you also use the moving square masks for the quantitative results in the paper, and how did you make the moving masks?

Thanks for reading, and thanks for your great work :)

Taking a long time to test compared to FGT

Hi @hitachinsk! Firstly, great work; I am very impressed. I like the performance of ISVI compared to FGT; however, I noticed that, for a video of only 60 frames, ISVI takes over an hour on an A100 80 GB GPU, whereas FGT takes just a few minutes.

The bottleneck is mainly in the backward-flow pass, where missing pixels are interpolated and filled in, e.g.:

for indFrame in range(nFrame):
    # Index of missing pixels whose backward flow neighbor is from frame indFrame
    SourceFmInd = np.where(flowNN[:, 2, 0] == indFrame)

    print("{0:8d} pixels are from source frame {1:3d}"
          .format(len(SourceFmInd[0]), indFrame))
    # The locations of those missing pixels are
    # flowNN[SourceFmInd, 0, 0], flowNN[SourceFmInd, 1, 0]

    if len(SourceFmInd[0]) != 0:
        ...  # (rest of the loop body elided in the issue)

Can you help me resolve this please?
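
For reference, the per-frame np.where scan above repeats a full pass over all missing pixels once per frame; a single stable sort can produce the same per-frame index groups in one pass (a hedged NumPy sketch, not code from this repo):

import numpy as np

def group_pixels_by_source(flowNN):
    # flowNN[:, 2, 0] holds, for each missing pixel, its source frame index.
    src = flowNN[:, 2, 0].astype(np.int64)
    order = np.argsort(src, kind="stable")         # one O(N log N) sort overall
    frames, starts = np.unique(src[order], return_index=True)
    groups = np.split(order, starts[1:])           # pixel indices per source frame
    return dict(zip(frames.tolist(), groups))

Each value in the returned dict matches what np.where(flowNN[:, 2, 0] == indFrame)[0] would produce for that frame, without rescanning the array nFrame times.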

Code

Hi, thank you for your great work; the demo is really interesting. Are you going to release the code at some point?

Forward warp module installation

I tried to install the "forward_warp/cuda" module following your bash script "install_dependances.sh" and typed the commands below:

cd forward_warp/cuda
python setup.py install | grep "error"

However, I got the following errors:
[screenshot of build errors]

My machine has 3090 Ti GPUs, which only support CUDA >= 11.1, so it may not be possible to use the provided "forward_warp" module. Are there any alternatives to the module?

Thanks for your great work.
