Instant-angelo: Build high-fidelity Digital Twin within 20 Minutes!

Introduction

Neuralangelo facilitates high-fidelity 3D surface reconstruction from RGB video captures. It enables the creation of digital replicas of both small-scale objects and large real-world scenes, derived from common mobile devices. These digital replicas, or 'twins', are represented with an exceptional level of three-dimensional geo-detail.

Nevertheless, substantial room for improvement exists. At present, both the official implementation and the reimplementations of Neuralangelo require 40 hours and 40 GB of memory on an A100 for real-world scene reconstructions. An expedited variant in instant-nsr has been developed, but the results have been subpar due to parameter limitations.

To fill this gap in high-speed, high-fidelity reconstruction, our objective is to engineer an advanced iteration of Neuralangelo. This refined model focuses on high-fidelity neural surface reconstruction, streamlining the process to achieve results within 20 minutes while maintaining a high standard of quality.

We provide Quick Lookup examples of project outcomes. These examples can serve as a reference to help determine if this project is suitable for your use case scenario.

Installation

pip install torch torchvision
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install -r requirements.txt

For COLMAP, alternative installation options are also available on the COLMAP website.
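After installing, a quick check that the packages are visible to Python can save a failed run later. The sketch below only tests whether the modules can be found; it does not verify CUDA or version compatibility. The helper name `missing_modules` is made up for this example, and `tinycudann` is the import name of the tiny-cuda-nn PyTorch bindings.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch and torchvision come from the first pip command above;
    # tinycudann is installed by the tiny-cuda-nn bindings command.
    missing = missing_modules(["torch", "torchvision", "tinycudann"])
    print("missing:", ", ".join(missing) if missing else "none")
```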

Data preparation

To extract COLMAP data from custom images, you must first have COLMAP installed (you can find installation instructions [here]). Afterwards, place your images in the images/ folder. The structure of your data should be as follows:

-data_001
    -images
    -mask (optional)
-data_002
    -images
    -mask (optional)
-data_003
    -images
    -mask (optional)

Each separate data folder houses its own images folder.
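The layout above can be checked before running COLMAP. The following is a minimal sketch, not part of the repository: the function name `check_scene`, the accepted image extensions, and the assumption that each mask shares its file stem with the matching image are all illustrative choices.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def check_scene(scene_dir):
    """Return a list of problems found in one data folder (empty list means OK)."""
    scene = Path(scene_dir)
    images = scene / "images"
    if not images.is_dir():
        return [f"{scene.name}: missing images/ folder"]

    problems = []
    image_stems = {p.stem for p in images.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    if not image_stems:
        problems.append(f"{scene.name}: images/ contains no images")

    mask = scene / "mask"
    if mask.is_dir():  # the mask folder is optional
        mask_stems = {p.stem for p in mask.iterdir() if p.suffix.lower() in IMAGE_EXTS}
        unmatched = sorted(image_stems - mask_stems)
        if unmatched:
            problems.append(f"{scene.name}: {len(unmatched)} image(s) without a mask")
    return problems
```

Calling `check_scene("data_001")` returns an empty list when the folder is usable, so it can be looped over every data folder before starting a batch of reconstructions.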

If you have masks, we recommend using them to filter the COLMAP sparse points before starting the reconstruction. You can use the following scripts for preprocessing:

python scripts/run_colmap.py ${INPUT_DIR}
python scripts/filter_colmap.py --data ${INPUT_DIR} --output-dir ${INPUT_DIR}_filtered

In these commands, replace ${INPUT_DIR} with the actual directory path where your data is located.

The first line runs the COLMAP reconstruction script on the full images. The second line filters the COLMAP sparse points using the specified masks and saves the filtered data in a new output directory with the suffix "_filtered".
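To illustrate what mask-based filtering of sparse points means, here is a minimal single-view sketch using NumPy. It is not the code in scripts/filter_colmap.py, whose internals are not shown here: the function name, the pinhole camera model without distortion, and the rule "keep a point if it projects onto a non-zero mask pixel" are assumptions made for this example. R and t follow the world-to-camera convention.

```python
import numpy as np


def filter_points_by_mask(points, K, R, t, mask):
    """Keep 3D points whose projection lands on a foreground (non-zero) mask pixel.

    points: (N, 3) world coordinates; K: (3, 3) pinhole intrinsics;
    R, t: world-to-camera rotation (3, 3) and translation (3,);
    mask: (H, W) array where non-zero means foreground.
    """
    cam = points @ R.T + t                      # world -> camera coordinates
    in_front = cam[:, 2] > 1e-8                 # discard points behind the camera
    z = np.where(in_front, cam[:, 2], 1.0)      # placeholder depth avoids division by zero
    uv = (cam @ K.T)[:, :2] / z[:, None]        # perspective projection to pixels
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    h, w = mask.shape
    inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    keep = np.zeros(len(points), dtype=bool)
    keep[inside] = mask[v[inside], u[inside]] > 0
    return points[keep]
```

A real pipeline would combine the decision over all views in which a point is observed; this sketch covers one view only.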

Start Reconstruction!

Run Smooth Surface Reconstruction in 20 Minutes

The smooth reconstruction mode is well-suited for the following cases:
  • When reconstructing a smooth object that does not have a high level of detail. The smooth mode works best for objects that have relatively simple, flowing surfaces without a lot of intricate features.

  • When you want a higher-fidelity substitute for instant-nsr that takes a similar amount of time (within 20 minutes) to generate but with fewer holes in the resulting model.


Information you need to know before you start:

  • The smooth reconstruction mode's reliance on curvature loss can over-smooth geometry, failing to capture flat surface structures and subtle variations on flatter regions of the original object.
  • This mode relies on sparse points generated by colmap to guide the geometry in the early stage of training. However, SFM (Structure from Motion) can sometimes generate noisy point clouds due to factors such as repeated texture, inaccurate poses, or incorrect point matches. To address this issue, one possible solution is to utilize more powerful SFM tools like hloc or DetectorFreeSfM. Additionally, post-processing techniques can be employed to further refine the point cloud. For example, using methods like Radius Outlier Removal in Open3D or pixsfm can help eliminate outliers and improve the quality of the point cloud.
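As an illustration of the radius outlier removal idea mentioned above, the sketch below drops points that have too few neighbours within a given radius. It is a brute-force NumPy reimplementation of the concept and does not call Open3D. The function name and parameters are chosen for this example, and the O(N²) distance matrix makes it suitable only for small clouds.

```python
import numpy as np


def radius_outlier_mask(points, radius, min_neighbors):
    """Boolean mask: True for points with at least min_neighbors other points within radius."""
    diff = points[:, None, :] - points[None, :, :]        # (N, N, 3) pairwise differences
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)          # squared pairwise distances
    neighbor_counts = (dist2 <= radius * radius).sum(axis=1) - 1  # subtract the point itself
    return neighbor_counts >= min_neighbors


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = np.vstack([rng.normal(scale=0.05, size=(50, 3)), [[10.0, 10.0, 10.0]]])
    kept = cloud[radius_outlier_mask(cloud, radius=0.5, min_neighbors=5)]
    print(len(cloud), "->", len(kept))
```

For sparse clouds of realistic size, a spatial index (or the equivalent routine in a point cloud library) is the better choice.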

Now it is time to start by running:

bash run_neuralangelo-colmap_sparse.sh ${INPUT_DIR}

This script is designed to automate the process of running SFM without the need for any preparation beforehand. It will automatically initiate the reconstruction process and export the resulting mesh. The output files will be saved in the logs directory.

If masks are available and placed in the mask folder under your data folder, run the reconstruction on the filtered data instead:

bash run_neuralangelo-colmap_sparse.sh ${INPUT_DIR}_filtered

Additionally, we have developed an experimental version called SH-neuralangelo, which uses Spherical Harmonics (SH) instead of a multilayer perceptron (MLP) for the radiance field. SH-neuralangelo is inspired by Plenoxel and Gaussian Splatting, incorporating progressive Spherical Harmonics for faster convergence and better coefficient regulation.

bash run_SH-neuralangelo-colmap_sparse.sh ${INPUT_DIR}

However, currently, SH-Neus is inferior to the original Neus with MLP in terms of PSNR and reconstruction quality. We are actively working on improving its quality and plan to support exporting Spherical Harmonics coefficients for real-time viewers in the future, similar to Gaussian Splatting.
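To make the Spherical Harmonics idea concrete, the sketch below evaluates view-dependent colour from SH coefficients up to degree 1 and unlocks degrees progressively during training. It is an illustration only: the schedule in `active_degree`, the array shapes, and all function names are hypothetical, and the sign convention of the degree-1 basis is one commonly used choice rather than a statement about this repository's code.

```python
import numpy as np

C0 = 0.28209479177387814  # constant degree-0 basis value
C1 = 0.4886025119029199   # magnitude of the degree-1 basis functions


def sh_basis_deg1(dirs):
    """Real SH basis up to degree 1 for unit directions (N, 3) -> (N, 4)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return np.stack([np.full(x.shape, C0), -C1 * y, C1 * z, -C1 * x], axis=1)


def active_degree(step, steps_per_degree, max_degree=1):
    """Hypothetical schedule: unlock one more SH degree every steps_per_degree steps."""
    return min(max_degree, step // steps_per_degree)


def sh_to_rgb(coeffs, dirs, degree):
    """coeffs: (N, 4, 3) SH coefficients per point; returns (N, 3) colours."""
    n_active = (degree + 1) ** 2  # 1 coefficient at degree 0, 4 at degree 1
    basis = sh_basis_deg1(dirs)[:, :n_active]
    return np.einsum("nk,nkc->nc", basis, coeffs[:, :n_active])
```

With `degree=0` the colour is view-independent; raising the degree later adds view-dependent terms on top of an already converged base colour, which is the motivation for a progressive schedule.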

Run Detail Surface Reconstruction in 1 Hour

The detail reconstruction mode without additional preprocessing is optimal for scenarios with:

  • Image data captured under varying conditions over time or with inconsistent exposure levels.
  • High-resolution image sources of 2K or 4K dimensions.
  • Reconstructing objects or scenes comprised of glossy, reflective materials.
  • Subjects containing large textureless surface regions.

Information you need to know before you start:

  • The detail reconstruction mode requires 2-3 times longer to complete compared to the smooth mode, owing to the use of a larger final hash grid resolution and more training steps.
  • For image inputs below 1K resolution, the detail mode may yield marginal improvements over other modes. Images under 1K likely do not provide sufficient information to take full advantage of the capabilities of detail reconstruction.

Now it is time to start by running:

bash run_neuralangelo-colmap_sparse-50k.sh ${INPUT_DIR}

Run Detail Surface Reconstruction in 20 Minutes

Generating high-fidelity surface reconstructions with only RGB inputs in 20,000 steps (around 20 minutes) is challenging, especially for sparse in-the-wild captures where occlusion and limited views make surface reconstruction an underconstrained problem. This can lead to optimization instability and difficulty converging. Introducing lidar, ToF depth, or predicted depth can help stabilize optimization and accelerate training. However, directly regularizing rendered depth is suboptimal due to bias introduced by density2sdf. Moreover, ensuring consistent depth across views is difficult, especially with lower-quality ToF sensors or predicted depth. We propose directly regularizing the SDF field using MVS point clouds and normals to alleviate this bias.

Importantly, in real-world scenarios like oblique photography and virtual tours, dense point clouds are already intermediate outputs. This allows directly utilizing the existing point clouds for regularization without extra computation. In such use cases, the point cloud prior comes for free as part of the capture process.
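One way to picture regularizing the SDF directly with MVS points and normals is a two-term loss: the SDF should be zero at the prior points, and its gradient should align with the prior normals. The sketch below is an assumed formulation for illustration and is not taken from this repository. The function names are hypothetical, and an analytic unit-sphere SDF stands in for the neural SDF.

```python
import numpy as np


def sdf_prior_loss(sdf_fn, grad_fn, points, normals):
    """Assumed loss terms for a point cloud prior.

    surface_term: mean |f(p)|, pulls the zero level set onto the points.
    normal_term: mean (1 - cos angle) between the SDF gradient and the normals.
    """
    sdf = sdf_fn(points)
    grad = grad_fn(points)
    grad = grad / (np.linalg.norm(grad, axis=1, keepdims=True) + 1e-8)
    surface_term = np.abs(sdf).mean()
    normal_term = (1.0 - (grad * normals).sum(axis=1)).mean()
    return surface_term, normal_term


# Toy stand-in for the neural SDF: an analytic unit sphere.
def sphere_sdf(p):
    return np.linalg.norm(p, axis=1) - 1.0


def sphere_grad(p):
    return p / np.linalg.norm(p, axis=1, keepdims=True)
```

Because both terms act on the SDF at 3D locations, they do not pass through volume rendering, which is the reason given above for preferring this over regularizing rendered depth.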


Information you need to know before you start:

  • An aligned dense point cloud with normals is required. Specify its relative path at dataset.dense_pcd_path in the config file.
  • The point cloud can be generated by various methods, either traditional MVS such as COLMAP or OpenMVS, or learning-based MVS methods. You can also generate the point cloud using commercial photogrammetry software such as Metashape or DJI.
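Since normals are required, it can help to confirm that a point cloud file declares them before training. The sketch below assumes the cloud is stored as a PLY file, which is an assumption of this example and not a stated requirement of the project. It reads only the header and works for both ASCII and binary PLY files; the function name is hypothetical.

```python
def ply_has_normals(path):
    """Return True if the PLY header declares nx, ny and nz vertex properties."""
    props = set()
    in_vertex = False
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # stop before the (possibly binary) payload
            if tokens[0] == "element":
                in_vertex = len(tokens) > 1 and tokens[1] == "vertex"
            elif tokens[0] == "property" and in_vertex:
                props.add(tokens[-1])  # the last token is the property name
    return {"nx", "ny", "nz"} <= props
```

This checks only that the normal fields exist; it does not verify that the cloud is aligned with the COLMAP poses.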

Now it is time to start by running:

bash run_neuralangelo-colmap_dense.sh ${INPUT_DIR}

Frequently asked questions (FAQ)

  1. Q: CUDA out of memory.

    A: Instant-angelo requires at least 10 GB of GPU memory. If you run out of memory, consider decreasing model.num_samples_per_ray from 1024 to 512.

  2. Q: What's the License for this repo?

    A: This repository is built on top of instant-nsr-pl and is licensed under the MIT License. The materials, code, and assets in this repository can be used for commercial purposes without explicit permission, in accordance with the terms of the MIT License. Users are free to use, modify, and distribute this content, even for commercial applications. However, appropriate attribution to the original instant-nsr-pl authors and this repository is requested. Please refer to the LICENSE file for full terms and conditions.

  3. Q: The reconstruction of my custom dataset is bad.

    A: This repository is under active development and its robustness across diverse real-world data is still unproven. Users may encounter issues when applying the method to new datasets. Please open an issue for any problems or contact the author directly at [email protected].

  4. Q: Generate dense prior with Vis-MVSNet is slow

    A: Currently, preprocessing takes around 10 to 15 minutes for 300 frames, but there remains much room to improve efficiency by replacing Vis-MVSNet with state-of-the-art methods like MVSFormer or SimpleRecon. Moreover, preprocessing time could be substantially reduced by leveraging quantization and TensorRT. Overall, MVSNet allows generating the necessary point cloud prior an order of magnitude faster than traditional MVS approaches.

  5. Q: This project fails to run on Windows

    A: This project has not been tested on Windows and the scripts may have compatibility issues. For the best experience at this stage of development, we recommend running experiments on a Linux system. We apologize that Windows support cannot be guaranteed currently. Please feel free to open an issue detailing any problems encountered when attempting to run on Windows. Community feedback will help improve cross-platform compatibility going forward.

Related projects:

  • instant-nsr-pl: Great Instant-NSR implementation in PyTorch-Lightning!
  • neuralangelo: Official implementation of Neuralangelo: High-Fidelity Neural Surface Reconstruction
  • sdfstudio: Unified Framework for SDF-based Neural Reconstruction, easy to develop with
  • torch-bakedsdf: Unofficial PyTorch implementation of BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis

Acknowledgement

  • Thanks to bennyguo for his excellent pipeline instant-nsr-pl
  • Thanks to RaduAlexandru for his implementation of improved curvature loss in permuto_sdf
  • Thanks to Alex Yu for his implementation of spherical harmonics in svox2
  • Thanks to Zesong Yang and Chris for providing valuable insights and feedback that assisted development

instant-angelo's People

Contributors

hugoycj
