
OcclusionFusion: realtime dynamic 3D reconstruction based on single-view RGB-D

Home Page: https://wenbin-lin.github.io/OcclusionFusion/


occlusionfusion's Introduction

OcclusionFusion (CVPR'2022)

Overview

This repository contains the code for the CVPR 2022 paper OcclusionFusion, where we introduce a novel method to calculate occlusion-aware 3D motion to guide dynamic 3D reconstruction.

In our technique, the motion of visible regions is first estimated and combined with temporal information to infer the motion of the occluded regions through an LSTM-involved graph neural network.
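As a rough illustration of this idea only (the layer types and dimensions below are invented, not the paper's architecture), per-frame graph convolutions can be combined with an LSTM over node features across time:

import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv

class MotionGNNSketch(nn.Module):
    """Illustrative sketch only: per-frame graph convolutions extract
    spatial features, and an LSTM aggregates them across frames to
    regress per-node 3D motion. Dimensions are invented."""
    def __init__(self, in_dim=6, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)  # per-node 3D motion

    def forward(self, x_seq, edge_index):
        # x_seq: [T, N, in_dim] node features over T frames
        feats = []
        for x in x_seq:
            h = torch.relu(self.conv1(x, edge_index))
            h = torch.relu(self.conv2(h, edge_index))
            feats.append(h)
        h_seq = torch.stack(feats, dim=1)  # [N, T, hidden], nodes as batch
        out, _ = self.lstm(h_seq)
        return self.head(out[:, -1])       # motion estimate for the latest frame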

Currently, we provide a pretrained model and a demo. Code for data pre-processing, network training and evaluation will be available soon.

Setup

We use Python 3.8.10, PyTorch 1.8.0, and PyTorch Geometric 1.7.2.

conda create -n occlusionfu python==3.8.10
conda activate occlusionfu
pip install -r requirements.txt
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2 -c pytorch
pip install torch-scatter==2.0.8 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
pip install torch-sparse==0.6.12 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
pip install torch-cluster==1.5.9 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
pip install torch-spline-conv==1.2.1 -f https://pytorch-geometric.com/whl/torch-1.8.0+cu102.html
pip install torch-geometric==1.7.2
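
After installation, a quick sanity check that the pinned versions and CUDA are visible:

import torch
import torch_geometric

print(torch.__version__)            # expect 1.8.0
print(torch_geometric.__version__)  # expect 1.7.2
print(torch.cuda.is_available())    # True if a CUDA GPU is usable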

Running the demo

Run the demo with the pretrained model and prepared inputs:

python demo.py

Visualize the input and output:

python visualize.py

The default setting of visualize.py renders the network's input and output to a video as follows. You can also change the setting to view the input and output in the Open3D viewer.
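
If you would rather inspect the nodes yourself, here is a minimal Open3D sketch, assuming the node positions are available as an Nx3 NumPy array (the file name below is hypothetical):

import numpy as np
import open3d as o3d

nodes = np.load("output_nodes.npy")       # hypothetical path; [N, 3] node positions

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(nodes)
o3d.visualization.draw_geometries([pcd])  # interactive viewer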

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{lin2022occlusionfusion,
    title={OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction},
    author={Lin, Wenbin and Zheng, Chengwei and Yong, Jun-Hai and Xu, Feng},
    booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2022}
}


occlusionfusion's Issues

down sample & up sample?More information?

Is it a random process in down sample & up sample?Or you process by another way ? I am confused for it and did't get useful information in your arXiv paper.
And Is it feasible to OPEN SOURCES more information about some details about your code release?
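
For readers with the same question: one common deterministic alternative to random node sampling is farthest point sampling, sketched below with torch-geometric (illustrative only; not confirmed to be what this repo does):

import torch
from torch_geometric.nn import fps

pos = torch.rand(1024, 3)                       # node positions
idx = fps(pos, ratio=0.25, random_start=False)  # keep ~25% of the nodes, fixed start
coarse = pos[idx]                               # down-sampled node set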

Code for geometry fusion

Hi! Thanks for your great work! Your results look amazing!

I've tried your demo code and got the output nodes. The results look super smooth!

I'm very interested in reproducing the whole pipeline of your algorithm. Apparently, some parts are still missing. For example, the inputs to your demo code are the visible nodes, already matched with the complete node graph. In your paper, you mention that the geometry fusion part is based on DynamicFusion. So I'm wondering: do you plan to release the code that fuses the motion nodes into the canonical volume? I know the code might be messy, so if these TSDF-related parts are not in your release plan, could you point me to the repo that is closest to your implementation?

I did find a very early implementation of DynamicFusion here, https://github.com/mihaibujanca/dynamicfusion, but its libraries seem outdated. Could you offer me some guidance on the choice of the overall pipeline, please?

Thanks again.
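
For context, the per-voxel fusion step shared by KinectFusion-style systems, including DynamicFusion's canonical volume, is a weighted running average of truncated SDF samples. A minimal sketch that omits warping the samples into the canonical frame:

import numpy as np

def tsdf_update(D, W, d, w=1.0, w_max=255.0):
    """D, W: current per-voxel TSDF values and weights.
    d: new truncated SDF sample; w: its weight.
    Returns the updated running average and clamped weight."""
    D_new = (W * D + w * d) / (W + w)
    W_new = np.minimum(W + w, w_max)
    return D_new, W_new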

Fitting the 16GB RAFT model on a 2080Ti?

Hey, a question about training the optical flow model: was the RAFT model shrunk down in some way to fit on the 11GB GPU that you mention in the paper?

If so, will the code for training the RAFT optical flow model also be released?

Background subtraction

Hi! Thank you for the great work! Your real-time results look amazing!

I have a question about the depth image data. It seems that in all of your reconstructed results, the background (for example, the walls) is removed. May I ask how you do it? Optical flow can solve part of the problem, but optical flow is generated from color images, right? I assume it is not perfect: for example, if you pick all the points with flow values u^2+v^2 > 1, there will always be some background pixels included in the masked area. Do you set a threshold on the input depth values to subtract the background at the very beginning? Do you remove it after you compute the optical flow? Do you build everything into the canonical model anyway and just not visualize it in the experiments? Or something else?

In some other cases, the person may not move very drastically, so the optical flow may miss a large part of the person. Do you run into similar problems? Any idea how I can solve this?

Thanks again.
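
For reference, the depth-threshold variant hypothesized above is a one-liner; the file name and the 1.5 m cutoff below are arbitrary illustrations:

import numpy as np

depth = np.load("depth.npy")                # hypothetical [H, W] depth map in meters
foreground = (depth > 0.0) & (depth < 1.5)  # zero depth = invalid; 1.5 m cutoff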

Question on OpticalFlow to Node motion

Congrats on your great work! I am currently trying to re-implement it and have run into a problem: you say in the paper that you generate 3D node motion from the optical flow image. I can think of two ways to do this:

  1. Project each node's position into the optical flow image and read the value at that pixel.
  2. Compute the motion of each vertex, and derive the node motion by averaging the motion of its nearby vertices.

Which one should I choose?

Best,
Haonan
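
A sketch of option 1 for reference: project each node into the flow image with a pinhole model and read the flow at the nearest pixel. The intrinsics fx, fy, cx, cy and the nearest-pixel lookup are illustrative assumptions, not the authors' implementation:

import numpy as np

def sample_flow_at_nodes(nodes, flow, fx, fy, cx, cy):
    """nodes: [N, 3] camera-space positions; flow: [H, W, 2] optical flow."""
    u = fx * nodes[:, 0] / nodes[:, 2] + cx   # pinhole projection
    v = fy * nodes[:, 1] / nodes[:, 2] + cy
    ui = np.clip(np.round(u).astype(int), 0, flow.shape[1] - 1)
    vi = np.clip(np.round(v).astype(int), 0, flow.shape[0] - 1)
    return flow[vi, ui]                       # [N, 2] per-node 2D motion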

How to decrease memory usage?

This is a really great project. We tried to run the demo on our PC (NVIDIA GeForce RTX 3080 Ti), but encountered the error below. Can anyone tell us how to resolve it? We have tried setting max_split_size_mb, but it did not work.

File "E:\projects\OcclusionFusion\model.py", line 89, in forward
  feature7 = self.layer72(feature7, edge_indexes[0])
...
RuntimeError: CUDA out of memory. Tried to allocate 2.00 MiB (GPU 0; 12.00 GiB total capacity; 11.10 GiB already allocated; 0 bytes free; 11.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
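
Two generic mitigations, not specific to this repo, that sometimes help with inference-time OOM:

import os

# Must be set before the first CUDA allocation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Disabling autograd avoids keeping activations for a backward pass.
with torch.no_grad():
    pass  # output = model(inputs)  -- placeholder for the repo's forward call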

How are the graph and node data generated?

Dear author,
Thank you for open-sourcing this program. How are the graph and node data generated? Are they extracted from a depth camera?

Rotation term during post-processing

Hi @wenbin-lin,
Thanks for releasing the motion completion module in OcclusionFusion.
I have a question regarding the implementation of the graph-based ARAP deformation. The paper mentions using the ARAP term from Embedded Deformation, and that it is similar to the post-processing applied by 4DComplete.
4DComplete uses ARAP on all vertices of the mesh, whereas OcclusionFusion uses only the graph nodes, so during optimization the rotation term will not update. Can you also share the code for the post-processing?

Otherwise, can you explain the ARAP step in greater detail: the loss term, which parameters are updated, the number of iterations, etc.?
Also, during reconstruction, is the post-processing performed before or after the optimization?
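
For reference, the regularization term from Embedded Deformation (Sumner et al. 2007) that the question refers to is usually written as

E_reg = \sum_i \sum_{j \in N(i)} || R_i (g_j - g_i) + g_i + t_i - (g_j + t_j) ||_2^2

where g_i are the node positions and (R_i, t_i) are the per-node rotation and translation being optimized.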

Code Release

Great work. Are you planning on making the source code available?
Thank you

Complete node graph creation in DeepDeform and Live Demo

Thanks for sharing this amazing work!
I have a question regarding the complete node graph used as input to the Occlusion-aware Motion Estimation Network module.

Unlike DeformingThings4D, where the complete object surface is known, in datasets like DeepDeform or in a live demo only the front view is available. In these cases, what is the input to the module?

  1. Is the complete object surface precomputed (maybe by DynamicFusion)?
  2. Or is only the graph extracted from the front-view RGB-D image at frame t_0 used, with all confidence and visibility scores computed on this graph and no graph update made during the motion estimation step?

Question on multi-person reconstruction

Hi, could OcclusionFusion support multi-person reconstruction? If not, would it be feasible to crop each person's bounding box via detection and then forward multiple tensors through your model?

About the dataset version of FlyingThings3D

Thanks for your wonderful work.
I am trying to reproduce your modification of the optical flow model RAFT and stumbled upon a choice between the full version and the subset version (which FlowNet2.0 uses) of FlyingThings3D, so I want to ask: which version did you use?
