Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion

This project is the official repository of the ECCV 2022 paper: Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion.

Authors: Yiheng Li, Connelly Barnes, Kun Huang, and Fang-Lue Zhang

paper, dataset, video

Abstract

Optical flow computation is essential in the early stages of the video processing pipeline. This paper focuses on a less explored problem in this area, the 360° optical flow estimation using deep neural networks to support increasingly popular VR applications. To address the distortions of panoramic representations when applying convolutional neural networks, we propose a novel multi-projection fusion framework that fuses the optical flow predicted by the models trained using different projection methods. It learns to combine the complementary information in the optical flow results under different projections. We also build the first large-scale panoramic optical flow dataset to support the training of neural networks and the evaluation of panoramic optical flow estimation methods. The experimental results on our dataset demonstrate that our method outperforms the existing methods and other alternative deep networks that were developed for processing 360° content.
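To make the distortion problem mentioned above concrete: in an equirectangular panorama every row covers the full 360° of longitude, so a row at latitude φ is stretched horizontally by a factor of 1/cos(φ) relative to the equator. The sketch below is illustrative only and is not taken from this repository's code. All function names are hypothetical, and the cubemap is used merely as one common alternative projection; see the paper for the projections the method actually fuses.

```python
import math

def pixel_to_lonlat(u, v, width, height):
    """Map an equirectangular pixel centre (u, v) to (longitude, latitude) in radians."""
    lon = ((u + 0.5) / width - 0.5) * 2.0 * math.pi  # range [-pi, pi)
    lat = (0.5 - (v + 0.5) / height) * math.pi       # range (-pi/2, pi/2)
    return lon, lat

def lonlat_to_direction(lon, lat):
    """Convert spherical angles to a unit direction (x, y, z), with y pointing up."""
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))

def horizontal_stretch(lat):
    """Horizontal over-sampling of an equirectangular row relative to the equator."""
    return 1.0 / math.cos(lat)

def cube_face(direction):
    """Name the cubemap face hit by a direction: the axis with the largest magnitude."""
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"
```

For example, horizontal_stretch(math.radians(60)) is about 2, meaning that content at 60° latitude appears roughly twice as wide as it would at the equator. This is the kind of distortion that makes a planar convolutional network less reliable near the poles, whereas a cubemap face keeps the distortion bounded.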

Poster

Video

A video presentation is available at Vimeo. Please have a look.

For those who cannot watch the video due to network issues, a compressed copy is available at poster_and_video/video.mp4.

Citation

@misc{2208.00776,
Author = {Yiheng Li and Connelly Barnes and Kun Huang and Fang-Lue Zhang},
Title = {Deep 360$^\circ$ Optical Flow Estimation Based on Multi-Projection Fusion},
Year = {2022},
Eprint = {arXiv:2208.00776},
}

Requirements

We mainly borrowed code from the PWC optical flow neural network and updated it to the newest PyTorch version. In addition, to accelerate conversion between different projections, we require a C++ and OpenCL environment for parallel computing.

Python side:

Please refer to requirements.txt. It requires: Python=3.9, PyTorch=1.12, CUDA=11.6, OpenCV, NumPy, and the nvcc toolchain.
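As a quick sanity check before installing, the Python-side versions listed above can be compared against the current environment. This is a minimal sketch and not part of the repository. The distribution names in EXPECTED (in particular opencv-python) are assumptions about how the packages were installed, and the function names are hypothetical.

```python
import sys
from importlib import metadata

# Version prefixes quoted in this README; None means "any version".
# The PyPI distribution names are assumptions and may differ in your setup.
EXPECTED = {"torch": "1.12", "opencv-python": None, "numpy": None}

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

def check_environment(expected=EXPECTED):
    """Return a list of human-readable mismatch messages (empty if nothing is wrong)."""
    problems = []
    if sys.version_info[:2] != (3, 9):
        found_py = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"python: found {found_py}, README lists 3.9")
    for name, prefix in expected.items():
        found = installed_version(name)
        if found is None:
            problems.append(f"{name}: not installed")
        elif prefix and not (found == prefix or found.startswith(prefix + ".")):
            problems.append(f"{name}: found {found}, README lists {prefix}")
    return problems
```

Calling check_environment() and printing each returned message gives a short report. CUDA and nvcc are not covered here, since they are not Python distributions.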

C++/OpenCL side:

Please refer to CMakeLists.txt. It requires: cmake>=3.4, cc/c++, opencv=4.6, opencl=2.2, and pybind11.

Installation

On Ubuntu 20.04, enter the project folder and execute the script install.sh. It will install the CUDA operator.
Then build the OpenCL code and copy the resulting dynamic library file to the project folder:

mkdir build && cd build && cmake .. && make -j8

Inference

Please update the input arguments in end_to_end_inference.py to set the model path and the dataset path, then run:

python end_to_end_inference.py

Contact

Please feel free to contact me (Yiheng) at [email protected], or raise your query in a GitHub issue :)

Known issues

I ported the projection algorithm from pure C++ to OpenCL to speed it up, which caused some loss in precision. Further work will include adding an end-to-end training script and fine-tuning some fusion models.
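One way to quantify a precision gap of this kind is to run the same input through two implementations and compare the resulting flow fields with the average end-point error (EPE), a standard optical flow metric. The sketch below is illustrative and not part of this repository. It assumes each flow field is a nested list of (u, v) pairs, whereas real code would more likely operate on NumPy arrays, and the function name is hypothetical.

```python
import math

def average_epe(flow_a, flow_b):
    """Mean end-point error between two flow fields given as rows of (u, v) pairs."""
    total, count = 0.0, 0
    for row_a, row_b in zip(flow_a, flow_b):
        for (ua, va), (ub, vb) in zip(row_a, row_b):
            # Euclidean distance between the two flow vectors at this pixel.
            total += math.hypot(ua - ub, va - vb)
            count += 1
    if count == 0:
        raise ValueError("flow fields are empty")
    return total / count

# Toy example: one pixel matches exactly, the other is off by the vector (3, 4).
reference = [[(1.0, 0.0), (0.0, 2.0)]]
approximate = [[(1.0, 0.0), (3.0, 6.0)]]
print(average_epe(reference, approximate))  # prints 2.5
```

An average EPE that is small relative to typical flow magnitudes would suggest the precision loss is negligible for a given use case.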
