Giter Site home page Giter Site logo

ricklentz / spin-nerf Goto Github PK

View Code? Open in Web Editor NEW

This project forked from samsunglabs/spin-nerf

0.0 0.0 0.0 17.47 MB

3D Scene Inpainting with NeRFs

License: Other

Shell 0.30% C++ 0.15% Python 18.41% C 0.02% Cuda 0.15% Jupyter Notebook 80.94% Dockerfile 0.03%

spin-nerf's Introduction

SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields

Project | Paper | YouTube | Dataset

Pytorch implementation of SPIn-NeRF. SPIn-NeRF leverages 2D priors from image inpainters, and enables view-consistent inpainting of NeRFs.


Quick Start

Dependencies

After installing Pytorch according to your CUDA version, install the rest of the dependencies:

pip install -r requirements.txt

Also, install LaMa dependencies with the following:

pip install -r lama/requirements.txt

You will also need COLMAP installed to compute poses if you want to run on your data.

Dataset preparation

Download the zip files of the dataset from here. Extract them under /data. Here, we provide information for running the statue scene. For other scenes, a similar approach with potentially different factor can be done.

Extract statue.zip under /data. This can be done with unzip statue.zip -d data. You might need to install unzip with sudo apt-get install unzip.

If you want to use your own data, make sure that you put your data in a folder under data with the following format (Note that labels under statue/images_2/label are 1 where we need inpainting, and 0 otherwise):

statue
├── images
│   ├── IMG_2707.jpg
│   ├── IMG_2708.jpg
│   ├── ...
│   └── IMG_2736.jpg
└── images_2
    ├── IMG_2707.png
    ├── IMG_2708.png
    ├── ...
    ├── IMG_2736.png
    └── label
        ├── IMG_2707.png
        ├── IMG_2708.png
        ├── ...
        └── IMG_2736.png

where in this example, we want to use --factor 2 for the images to use 2x downsized images for the fitting, thus we have put 2x downsized images under images_2. If your original images are larger, put the original images under images, and the Nx downsized images under images_N, where N is chosen based on your GPU availabitlity. Also, make sure to obtain camera parameters using COLMAP. This can be done with the following command:

python imgs2poses.py <your_datadir>

For example, for the sample statue dataset, the camera parameters can be obtained as python imgs2poses.py data/statue. Note that for this specific dataset, we have already provided the camera parameters and you can skip running COLMAP.

Running an initial NeRF for getting the depths

First, use the following command to render disparities from the training views. This can be done with the following:

rm -r LaMa_test_images/*
rm -r output/label/*
python DS_NeRF/run_nerf.py --config DS_NeRF/configs/config.txt --render_factor 1 --prepare --i_weight 1000000000 --i_video 1000000000 --i_feat 4000 --N_iters 4001 --expname statue --datadir ./data/statue --factor 2 --N_gt 0

After this, rendered disparities (inverse depths) are ready at lama/LaMa_test_images, with their corresponding labels at lama/LaMa_test_images/label.

Running LaMa to generate geometry and appearance guidance

First, let's run LaMa to generate depth priors:

cd lama

Now, make sure to follow the LaMa instructions for downloading the big-lama model.

export TORCH_HOME=$(pwd) && export PYTHONPATH=$(pwd)
python bin/predict.py refine=True model.path=$(pwd)/big-lama indir=$(pwd)/LaMa_test_images outdir=$(pwd)/output

Now, the inpainted disparities are ready at lama/output/label. Copy the images and put the under data/statue/images_2/depth. It can be done with the following:

dataset=statue
factor=2

rm -r ../data/$dataset/images_$factor/depth
mkdir ../data/$dataset/images_$factor/depth
cp ./output/label/*.png ../data/$dataset/images_$factor/depth

Now, let's generate the inpainted RGB images:

dataset=statue
factor=2

rm -r LaMa_test_images/*
rm -r output/label/*
cp ../data/$dataset/images_$factor/*.png LaMa_test_images
mkdir LaMa_test_images/label
cp ../data/$dataset/images_$factor/label/*.png LaMa_test_images/label
python bin/predict.py refine=True model.path=$(pwd)/big-lama indir=$(pwd)/LaMa_test_images outdir=$(pwd)/output
rm -r ../data/$dataset/images_$factor/lama_images
mkdir ../data/$dataset/images_$factor/lama_images
cp ../data/$dataset/images_$factor/*.png ../data/$dataset/images_$factor/lama_images
cp ./output/label/*.png ../data/$dataset/images_$factor/lama_images

The inpainted RGB images are now ready under lama/output/label, and have been copied to data/statue/images_2/lama_images.

statue
├── colmap_depth.npy
├── images
│   ├── IMG_2707.jpg
│   ├── ...
│   └── IMG_2736.jpg
├── images_2
│   ├── depth
│   │   ├── img000.png
│   │   ├── ...
│   │   └── img028.png
│   ├── IMG_2707.png
│   ├── IMG_2708.png
│   ├── ...
│   ├── IMG_2736.png
│   ├── label
│   │   ├── IMG_2707.png
│   │   ├── ... 
│   │   ├── IMG_2736.png
│   └── lama_images
│       ├── IMG_2707.png
│       ├── ...
│       └── IMG_2736.png
└── sparse

Let's move back to the main directory by cd ...

Running multiview inpainter

Now, using the following command, the optimization of the final inpainted NeRF will be started. A video of the inpainted NeRF will be saved every i_video iterations. The fitting will be done for N_iters iterations. A sample rendering from a random view point is saved to /test_renders every i_feat iterations, which can be used for early sanity checks and hyper-parameter tunings.

python DS_NeRF/run_nerf.py --config DS_NeRF/configs/config.txt --i_feat 200 --lpips --i_weight 1000000000000 --i_video 1000 --N_iters 10001 --expname statue --datadir ./data/statue --N_gt 0 --factor $factor

Note that our experiments were done on Nvidia A6000 GPUs. In case of running on GPUs with lower memory, you might get out-of-memory errors. To prevent that, please try increasing the arguments --lpips_render_factor and --patch_len_factor, or reducing --lpips_batch_size.

Notes on mask dilation

Please note that as mentioned in the paper, the masks are dilated by default with a 5x5 kernel for 5 iterations to ensure that all of the object is masked, and that the effects of the shadow of the unwanted objects on the scene is reduced. If you wish to alter the dilation, first, you need to change the dilations applied by the LaMa model to generate the inpaintings under lama/saicinpainting/evaluation/refinement.py at the following line:

tmp = cv2.dilate(tmp.cpu().numpy().astype('uint8'), np.ones((5, 5), np.uint8), iterations=5)

Then, you also need to change the LLFF loader to load the masks with proper dilations applied to them under DS_NeRF/load_llff.py. In this file, the following line is responsible for the dilations:

msk = cv2.dilate(msk, np.ones((5, 5), np.uint8), iterations=5)

BibTeX

If you find SPIn-NeRF useful in your work, please consider citing it:

@inproceedings{spinnerf,
      title={{SPIn-NeRF}: Multiview Segmentation and Perceptual Inpainting with Neural Radiance Fields}, 
      author={Ashkan Mirzaei and Tristan Aumentado-Armstrong and Konstantinos G. Derpanis and Jonathan Kelly and Marcus A. Brubaker and Igor Gilitschenski and Alex Levinshtein},
      year={2023},
      booktitle={CVPR},
}

spin-nerf's People

Contributors

ashmrz avatar ashkan-mirzaei avatar spinnerf3d avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.