License: MIT License

InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior

Zhiheng Liu* · Hao Ouyang* · Qiuyu Wang · Ka Leong Cheng · Jie Xiao · Kai Zhu · Nan Xue · Yu Liu · Yujun Shen · Yang Cao

Paper PDF Project Page
USTC | HKUST | Ant Group | Alibaba Group

News

  • [2024.4.13] 🔥 Released the paper, inference code, and pretrained checkpoint.
  • [On-going] Clean and organize the masks corresponding to the datasets used in the experiments.
  • [On-going] Scale up the model and training data, and release stronger models as foundation models for downstream tasks.
  • [To-do] Release training code.

Installation

Install with conda:

conda env create -f environment.yaml
conda activate infusion
  • 🛠️ For rendering depth, we use the diff-gaussian-rasterization-confidence from FSGS, thanks to their work! :)

Download Checkpoints

Download the InFusion checkpoint and put it in the 'checkpoints' folder:

Data Preparation

Our experiments are conducted on the datasets provided by Mip-NeRF, Instruct-NeRF2NeRF, and SPIn-NeRF. We will upload the masks used in the experiments and the challenging scenes we captured ourselves in a few days.

Taking "Garden" in Mip-NeRF as an example, each scene folder should be organized as follows.

Garden
├── images # RGB data
│   ├── DSC07956.JPG
│   ├── DSC07957.JPG
│   └── ...                   
├── seg # Mask 
│   ├── DSC07956.JPG
│   ├── DSC07957.JPG
│   └── ... 
│   # The part that needs to be inpainted is white
└── sparse # Colmap
    └── 0
        └── ...
  • 🛠️ You can prepare your own data following this structure. In addition, an accurate mask is very important. Here we recommend two image segmentation tools: Segment and Track Anything and Grounded SAM.

  • 🛠️ To obtain camera parameters and initial point cloud, please refer to 'convert.py' in Gaussian-Splatting :)
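To catch layout mistakes before training, a small script can verify the folder structure shown above. This helper is illustrative only (it is not part of the repo); the checks are assumptions based on the "Garden" tree, e.g. that every image in 'images' needs a same-named mask in 'seg':

```python
from pathlib import Path

def validate_scene(scene_dir):
    """Check that a scene folder follows the layout above.

    Returns a list of problems; an empty list means the layout looks right.
    The exact checks are an illustrative guess, not an official repo tool.
    """
    scene = Path(scene_dir)
    problems = []
    images = scene / "images"
    seg = scene / "seg"
    sparse0 = scene / "sparse" / "0"
    for d in (images, seg, sparse0):
        if not d.is_dir():
            problems.append(f"missing directory: {d}")
    if images.is_dir() and seg.is_dir():
        image_names = {p.name for p in images.iterdir() if p.is_file()}
        mask_names = {p.name for p in seg.iterdir() if p.is_file()}
        # Every RGB image should have a matching mask with the same filename.
        for name in sorted(image_names - mask_names):
            problems.append(f"no mask for image: {name}")
    return problems
```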

Instructions

The entire pipeline is divided into three stages:

  • Train the Incomplete Gaussians.
  • Inpaint Gaussians via Diffusion Prior.
  • Combine Inpainted Gaussians and Incomplete Gaussians.

🌺 Stage 1

Use pre-annotated masks to train incomplete gaussians.

cd gaussian_splatting
# Train incomplete gaussians (optionally add --color_aug; see the tip below)
python train.py -s <path to scene folder> -m <path to output folder> -u nothing --mask_training

# Obtain c2w matrix, intrinsic matrix, incomplete depth, and RGB renderings
python render.py -s <path to scene folder> -m <path to output folder> -u nothing
  • 🛠️ Tip: Sometimes the rendered depth has too many empty points. In that case, try --color_aug during training: it randomly selects the background color when rendering depth, which can make the depth map more reliable.

  • 🛠️ Some recent works focus on how to segment Gaussians. Since segmentation is not the focus of this work, we chose a relatively simple method. :)
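The repo does not spell out its segmentation step, but one simple approach consistent with the mask-based training above is to drop Gaussians whose centers project inside a view's inpainting mask. A rough sketch under that assumption (OpenCV-style pinhole camera with z forward; the function name and conventions are hypothetical, not the repo's API):

```python
import numpy as np

def gaussians_outside_mask(centers, c2w, K, mask):
    """Boolean array: True for Gaussians whose projection into this view
    lands outside the inpainting mask. Illustrative only.

    centers: (N, 3) world-space Gaussian centers
    c2w:     (4, 4) camera-to-world matrix
    K:       (3, 3) intrinsics
    mask:    (H, W) boolean array, True = region to inpaint (white in 'seg')
    """
    w2c = np.linalg.inv(c2w)
    homog = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    cam = (w2c @ homog.T).T[:, :3]           # points in camera space
    in_front = cam[:, 2] > 1e-6              # keep only points before the camera
    uv = (K @ cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    H, W = mask.shape
    inside_image = (0 <= u) & (u < W) & (0 <= v) & (v < H)
    keep = np.ones(len(centers), dtype=bool)
    idx = in_front & inside_image
    # A Gaussian projecting into the masked (white) area is dropped.
    keep[idx] = ~mask[v[idx], u[idx]]
    return keep
```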

🌺 Stage 2

Inpaint Gaussians using the depth inpainting model.

  • 📢 You need to select a single image in 'path to output folder/train/ours_30000/renders', mark the area that needs to be inpainted, and save it as 'mask.png'. (It does not have to be precise, but it must cover all the missing parts.)

  • 📢 Next, you need to inpaint the single image. Here are some great tools for inpainting a single image: Stable Diffusion XL Inpainting and Photoroom. Here is an example:

# Assume that the selected single image is named "DSC07956.JPG".
cd depth_inpainting/run
input_rgb_path=<path to inpainted single image>
input_mask_path=<path to 'mask.png'>
input_depth_path=<path to output folder/train/ours_30000/depth_dis/DSC07956.npy>
c2w=<path to output folder/train/ours_30000/c2w/DSC07956.npy>
intri=<path to output folder/train/ours_30000/intri/DSC07956.npy>
model_path=<path to depth_inpainting model checkpoint>  # absolute path
output_dir=<path to output folder>


CUDA_VISIBLE_DEVICES=0 python run_inference_inpainting.py \
            --input_rgb_path $input_rgb_path \
            --input_mask $input_mask_path \
            --input_depth_path $input_depth_path \
            --model_path $model_path \
            --output_dir $output_dir \
            --denoise_steps 20 \
            --intri $intri \
            --c2w $c2w \
            --use_mask \
            --blend  # Whether to use 'Blended Diffusion (https://arxiv.org/abs/2111.14818)' during inference. 
  • 🛠️ Tip: If the depth map obtained from a single inference is not satisfactory, use the newly written output_dir/<inpainted_image_name>_depth_dis.npy as the new $input_depth_path and loop two or three times to get better results.
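The refinement loop from the tip above can be expressed as a small helper that chains the depth path across rounds. The path pattern follows the tip; the helper name is hypothetical, and the actual inference call is left as a comment since it needs the checkpoint and a GPU:

```python
from pathlib import Path

def refinement_depth_paths(initial_depth, image_name, output_dir, rounds=3):
    """Input depth path to use on each refinement round.

    Round 1 uses the depth rendered in Stage 1; each later round reuses the
    '<image_name>_depth_dis.npy' file written to output_dir by the previous
    round, as described in the tip above.
    """
    refined = str(Path(output_dir) / f"{image_name}_depth_dis.npy")
    return [initial_depth if i == 0 else refined for i in range(rounds)]

# For each path, rerun the Stage 2 command with --input_depth_path set to it:
# for depth in refinement_depth_paths(input_depth_path, "DSC07956", output_dir):
#     python run_inference_inpainting.py ... --input_depth_path $depth ...
```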

🌺 Stage 3

Combine the inpainted and incomplete Gaussians, then quickly fine-tune on the inpainted single image.

# Assume that the selected single image is named "DSC07956.JPG".
origin_ply="path to output folder/point_cloud/iteration_30000/point_cloud.ply"
supp_ply="path to output folder/DSC07956_mask.ply"
save_ply="path to output folder/point_cloud/iteration_30001/point_cloud.ply"
# Combine inpainted gaussians and incomplete gaussians.
python compose.py --original_ply $origin_ply  --supp_ply $supp_ply --save_ply $save_ply --nb_points 100 --threshold 1.0
# Fine-tune on inpainted single image for 150 iterations.
python train.py -s <path to scene folder> -m <path to output folder> -u DSC07956.JPG -n <path to inpainted single image> --load_iteration 30001 --iteration 150
# Render
python render.py -s <path to scene folder> -m <path to output folder> -u nothing --iteration 150
  • 🛠️ The --nb_points and --threshold parameters remove floaters near the edges of the inpainted Gaussians; increasing their values removes more surrounding points. Removing floaters is very important for the final rendering results, so find the most suitable values for each scene.

  • 🛠️ As explicit points, Gaussians can be directly edited and cropped in practical applications, such as KIRI Engine.
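The floater-removal knobs above suggest a radius-style outlier filter: keep a point only if it has at least nb_points neighbours within threshold distance. A brute-force sketch of that idea (illustrative, not compose.py's actual implementation; real code would use a KD-tree such as scipy.spatial.cKDTree):

```python
import numpy as np

def remove_floaters(points, nb_points=100, threshold=1.0):
    """Keep only points with at least `nb_points` neighbours within
    `threshold` distance. O(N^2) brute force for clarity.

    points: (N, 3) array of point positions
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Count neighbours within threshold, excluding the point itself.
    neighbour_counts = (dist < threshold).sum(axis=1) - 1
    return points[neighbour_counts >= nb_points]
```

Isolated points far from any dense cluster (the "floaters") fail the neighbour test and are dropped; raising threshold or lowering nb_points keeps more points.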

Acknowledgements

This project is developed on the codebase of Gaussian-Splatting, Marigold and Magicboomliu. We appreciate their great works!

Citation

If you find this repository useful in your work, please consider citing the following paper and giving a ⭐ to the repository so that more people can discover it:

@article{liu2024infusion,
      title={InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior},
      author={Liu, Zhiheng and Ouyang, Hao and Wang, Qiuyu and Cheng, Ka Leong and Xiao, Jie and Zhu, Kai and Xue, Nan and Liu, Yu and Shen, Yujun and Cao, Yang},
      journal={arXiv preprint arXiv:2404.11613},
      year={2024}
    }
