This project is forked from sascha-kirch/rgb-d-fusion.

License: Apache License 2.0


🌈 RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects

Authors:
Sascha Kirch, Valeria Olyunina, Jan Ondřej, Rafael Pagés, Sergio Martín & Clara Pérez-Molina

[Paper] [BibTex]

TensorFlow implementation for RGB-D-Fusion. For details, see the paper RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects.

💡 Contribution

  • We provide a framework for high-resolution dense monocular depth estimation using diffusion models.
  • We perform super-resolution of dense depth data conditioned on a multi-modal RGB-D input using diffusion models.
  • We introduce a novel augmentation technique, depth noise, to enhance the robustness of the depth super-resolution model.
  • We perform rigorous ablations and experiments to validate our design choices.
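
The depth-noise augmentation can be illustrated with a minimal sketch. Note this is a simplified stand-in, not the paper's exact formulation: it perturbs each depth value with Gaussian noise, and the function name and noise model are illustrative assumptions.

```python
import random

def add_depth_noise(depth_map, sigma=0.05, seed=None):
    """Perturb a depth map with per-pixel Gaussian noise.

    Simplified illustration of a depth-noise augmentation; the
    formulation used in the paper may differ (see the paper for details).
    """
    rng = random.Random(seed)
    return [[d + rng.gauss(0.0, sigma) for d in row] for row in depth_map]

# Example on a tiny 2x2 depth map (values in metres)
noisy = add_depth_noise([[1.0, 2.0], [3.0, 4.0]], sigma=0.01, seed=0)
```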

🔥 News

  • 2023/10/14: Code is now available!
  • 2023/09/04: Our paper is now published in IEEE Access!
  • 2023/07/29: We release our pre-print on arXiv.

⭐ Framework

rgb-d-fusion framework

🎖️ Results

Prediction vs. GT

results_gt

In the wild predictions

results_in_the_wild_1 results_in_the_wild_1 results_in_the_wild_1

🛠️ Installation

We recommend using a Docker environment. We provide a Dockerfile based on TensorFlow and a Dockerfile based on NVIDIA's container image. The latter is larger but includes NVIDIA's performance optimizations. Ensure Docker is installed, including NVIDIA's GPU support (the NVIDIA Container Toolkit).

  1. Build the image (note the trailing `.`, the build context):
docker build -t <IMAGE_NAME>/<VERSION> -f <PATH_TO_DOCKERFILE> .
  2. Create the container:
docker container create --gpus all -u 1000:1000 --name rgb-d-fusion -p 8888:8888 -v <PATH_TO_tf_DIR>:/tf -v <PATH_TO_YOUR_GIT_DIR>:/tf/GitHub -it <IMAGE_NAME>/<VERSION>
  3. Start the container:
docker start rgb-d-fusion

The directory hierarchy should look as follows:

|- tf
   |- manual_datasets
      |- <DATASET 1>
         |- test
            |- DEPTH_RENDER_EXR
            |- MASK
            |- PARAM
            |- RENDER
         |- train                     # same hierarchy as in test
      |- <DATASET 2>                  # same hierarchy as inv_humas_rendered
   |- GitHub
      |- ConditionalDepthDiffusion    # this repo
   |- output_runs                     # auto-generated directory to store results
      |- DepthDiffusion
         |- checkpoints               # stores saved model checkpoints
         |- illustrations             # illustrations generated during or after training
         |- diffusion_output          # stores data sampled from the model during inference
      |- SuperResolution              # same hierarchy as in DepthDiffusion

The hierarchy may live in a single location or be split across several directories; when starting the Docker container, the different directories can be mounted together.
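As a convenience, the output hierarchy above can be pre-created with a short script. This is only a sketch based on the tree shown in this README (the repo auto-generates output_runs itself); the ROOT path is an assumption you should adjust:

```python
import os

# Recreate the output_runs hierarchy described above.
# ROOT is illustrative; point it at your mounted /tf directory.
ROOT = "tf/output_runs"
SUBDIRS = ["checkpoints", "illustrations", "diffusion_output"]

for model in ("DepthDiffusion", "SuperResolution"):
    for sub in SUBDIRS:
        os.makedirs(os.path.join(ROOT, model, sub), exist_ok=True)
```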

Run Training, Evaluation and/or Inference scripts

Scripts are located under scripts. Currently there are two types of models:

  1. Depth Diffusion Model: a diffusion model that generates a depth map conditioned on an RGB image.
  2. Superresolution Diffusion Model: a diffusion model that generates high-resolution RGB-D from low-resolution RGB-D.

Each model has its dedicated training, evaluation, and inference scripts written in Python. You can check the functionality and parameters via python <SCRIPT> -h.
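
The real flags differ per script (always check `python <SCRIPT> -h`); purely as a sketch, a typical entry point might expose arguments like the following. All flag names below are illustrative assumptions, not the repo's actual interface:

```python
import argparse

def build_parser():
    # Illustrative only: the actual scripts define their own flags (see -h).
    parser = argparse.ArgumentParser(description="Train a depth diffusion model.")
    parser.add_argument("--dataset-dir", required=True,
                        help="Dataset root, e.g. a folder under tf/manual_datasets")
    parser.add_argument("--batch-size", type=int, default=8)
    parser.add_argument("--epochs", type=int, default=100)
    return parser

# Example invocation with an assumed dataset path
args = build_parser().parse_args(["--dataset-dir", "tf/manual_datasets/demo"])
```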

✒️ Citation

If you find our work helpful for your research, please consider citing the following BibTeX entry.

@article{kirch_rgb-d-fusion_2023,
 title = {RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects},
 author = {Kirch, Sascha and Olyunina, Valeria and Ondřej, Jan and Pagés, Rafael and Martín, Sergio and Pérez-Molina, Clara},
 journal = {IEEE Access},
 year = {2023},
 volume = {11},
 issn = {2169-3536},
 doi = {10.1109/ACCESS.2023.3312017},
 pages = {99111--99129},
 url = {https://ieeexplore.ieee.org/document/10239167},
}
