thermalmonodepth

Maximizing Self-supervision from Thermal Image for Effective Self-supervised Learning of Depth and Ego-motion

This repository is the official implementation of the paper:

Maximizing Self-supervision from Thermal Image for Effective Self-supervised Learning of Depth and Ego-motion

Ukcheol Shin, Kyunghyun Lee, Byeong-Uk Lee, In So Kweon

IEEE Robotics and Automation Letters (RA-L) 2022 & IROS 2022

[PDF] [Project webpage] [Full paper] [Youtube]

Introduction

Recently, self-supervised learning of depth and ego-motion from thermal images has shown strong robustness and reliability under challenging lighting and weather conditions. However, inherent thermal image properties such as weak contrast, blurry edges, and noise hinder the generation of effective self-supervision from thermal images. Therefore, most previous research relies on additional self-supervisory sources such as RGB video, generative models, and LiDAR information. In this paper, we conduct an in-depth analysis of the thermal image characteristics that degrade self-supervision from thermal images. Based on this analysis, we propose an effective thermal image mapping method that significantly increases image information, such as overall structure, contrast, and details, while preserving temporal consistency. By resolving this fundamental problem of the thermal image, our depth and pose networks trained only with thermal images achieve state-of-the-art results without utilizing any extra self-supervisory source. To the best of our knowledge, this work is the first self-supervised learning approach to train monocular depth and relative pose networks with only thermal images.
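As a loose illustration of the mapping idea (not the method proposed in the paper), a percentile-clipped rescaling of a raw radiometric thermal frame already recovers far more contrast than naive min-max normalization; the clip percentiles and bit depth below are assumptions for the sketch:

import numpy as np

def rescale_thermal(raw, lo_pct=1.0, hi_pct=99.0):
    """Map a raw radiometric thermal frame (e.g., 14-bit counts) to [0, 1].

    Illustrative sketch only: the paper's mapping additionally preserves
    temporal consistency across frames, which is omitted here.
    """
    lo, hi = np.percentile(raw, [lo_pct, hi_pct])
    out = (raw.astype(np.float32) - lo) / max(hi - lo, 1e-6)
    return np.clip(out, 0.0, 1.0)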

Please refer to the video for more descriptions and visual results.


Main Results

Depth Results

Indoor test set (Well-lit)

| Models   | Abs Rel | Sq Rel | RMSE  | RMSE(log) | Acc.1 | Acc.2 | Acc.3 |
|----------|---------|--------|-------|-----------|-------|-------|-------|
| Shin(T)  | 0.225   | 0.201  | 0.709 | 0.262     | 0.620 | 0.920 | 0.993 |
| Shin(MS) | 0.156   | 0.111  | 0.527 | 0.197     | 0.783 | 0.975 | 0.997 |
| Ours     | 0.152   | 0.121  | 0.538 | 0.196     | 0.814 | 0.965 | 0.992 |

Indoor test set (Low-/Zero- light)

| Models   | Abs Rel | Sq Rel | RMSE  | RMSE(log) | Acc.1 | Acc.2 | Acc.3 |
|----------|---------|--------|-------|-----------|-------|-------|-------|
| Shin(T)  | 0.232   | 0.222  | 0.740 | 0.268     | 0.618 | 0.907 | 0.987 |
| Shin(MS) | 0.166   | 0.129  | 0.566 | 0.207     | 0.768 | 0.967 | 0.994 |
| Ours     | 0.149   | 0.109  | 0.517 | 0.192     | 0.813 | 0.969 | 0.994 |

Outdoor test set (Night-time)

| Models   | Abs Rel | Sq Rel | RMSE  | RMSE(log) | Acc.1 | Acc.2 | Acc.3 |
|----------|---------|--------|-------|-----------|-------|-------|-------|
| Shin(T)  | 0.157   | 1.179  | 5.802 | 0.211     | 0.750 | 0.948 | 0.985 |
| Shin(MS) | 0.146   | 0.873  | 4.697 | 0.184     | 0.801 | 0.973 | 0.993 |
| Ours     | 0.109   | 0.703  | 4.132 | 0.150     | 0.887 | 0.980 | 0.994 |
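These are the standard monocular depth metrics computed with per-image median scaling. As a reference, a minimal sketch of how they are commonly computed is shown below; the depth range caps are assumptions and should be set to match the dataset (indoor vs. outdoor):

import numpy as np

def compute_depth_metrics(gt, pred, min_depth=0.1, max_depth=10.0):
    # Evaluate only valid ground-truth pixels within the assumed depth range.
    mask = (gt > min_depth) & (gt < max_depth)
    gt, pred = gt[mask], pred[mask]
    # Median scaling resolves the scale ambiguity of monocular predictions.
    pred = pred * (np.median(gt) / np.median(pred))
    pred = np.clip(pred, min_depth, max_depth)

    thresh = np.maximum(gt / pred, pred / gt)
    a1, a2, a3 = [(thresh < 1.25 ** k).mean() for k in (1, 2, 3)]  # Acc.1/2/3
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean((gt - pred) ** 2 / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3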

Pose Estimation Results

Indoor-static-dark

| Models   | ATE    | RE     |
|----------|--------|--------|
| Shin(T)  | 0.0063 | 0.0092 |
| Shin(MS) | 0.0057 | 0.0089 |
| Ours     | 0.0059 | 0.0082 |

Outdoor-night1

| Models   | ATE    | RE     |
|----------|--------|--------|
| Shin(T)  | 0.0571 | 0.0280 |
| Shin(MS) | 0.0562 | 0.0287 |
| Ours     | 0.0546 | 0.0287 |
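ATE (absolute trajectory error) and RE (rotation error) are typically reported over short trajectory snippets. A rough sketch of the usual ATE computation, assuming least-squares scale alignment over a snippet of camera positions (the exact snippet length and alignment protocol used in the paper may differ), is:

import numpy as np

def snippet_ate(gt_xyz, pred_xyz):
    """ATE over one snippet of camera positions, each of shape (N, 3).

    Sketch under common conventions: both trajectories are expressed relative
    to the first frame, and only a global scale is optimized here.
    """
    scale = np.sum(gt_xyz * pred_xyz) / np.sum(pred_xyz ** 2)
    err = gt_xyz - scale * pred_xyz
    return np.sqrt(np.mean(np.sum(err ** 2, axis=1)))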

Getting Started

Prerequisite

This codebase was developed and tested with Python 3.7, PyTorch 1.5.1, and CUDA 10.2 on Ubuntu 16.04.

conda env create --file environment.yml

Pre-trained Model

Our pre-trained models are available at this link.

Datasets

For the raw ViViD dataset, download the data provided on the official website.

For our post-processed dataset, please refer to this GitHub page.

After downloading our post-processed dataset, unzip the files to form the structure below.

Expected dataset structure for the post-processed ViViD dataset:

KAIST_VIVID/
  calibration/
    cali_ther_to_rgb.yaml, ...
  indoor_aggressive_local/
    RGB/
      data/
        000001.png, 000002.png, ...
      timestamps.txt
    Thermal/
      data/
      timestamps.txt
    Lidar/
      data/
      timestamps.txt
    Warped_Depth/
      data/
      timestamps.txt
    avg_velocity_thermal.txt
    poses_thermal.txt
    ...
  indoor_aggressive_global/
    ...	
  outdoor_robust_day1/
    ...
  outdoor_robust_night1/
    ...

Given the above dataset structure, you can generate the training/testing data by running the script:

sh scripts/prepare_vivid_data.sh
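The preparation step essentially synchronizes the modalities listed above. As an illustration only (not the actual script logic), pairing each thermal frame with its nearest warped-depth frame by timestamp could look like the following; the tolerance value is an assumption:

import numpy as np

def load_timestamps(path):
    # timestamps.txt: one timestamp (seconds) per line, same order as the data/ folder.
    return np.loadtxt(path)

def match_by_timestamp(ther_ts, depth_ts, max_diff=0.05):
    """Return (thermal_idx, depth_idx) pairs whose timestamps differ by less than max_diff seconds."""
    pairs = []
    for i, t in enumerate(ther_ts):
        j = int(np.argmin(np.abs(depth_ts - t)))
        if abs(depth_ts[j] - t) < max_diff:
            pairs.append((i, j))
    return pairs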

Training

The "scripts" folder provides several examples for training, testing, and visualization.

You can train the depth and pose models on the ViViD dataset by running

sh scripts/train_vivid_resnet18_indoor.sh
sh scripts/train_vivid_resnet18_outdoor.sh

Then you can start a tensorboard session in this folder by

tensorboard --logdir=checkpoints/

and visualize the training progress by opening http://localhost:6006 in your browser.
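Under the hood, the self-supervised objective warps a source frame into the target view using the predicted depth and pose and penalizes the photometric difference. A minimal single-scale sketch of such a loss in the generic SfM-learner style (not the repository's exact implementation, which adds masking and further terms) is:

import torch
import torch.nn.functional as F

def photometric_warp_loss(tgt, src, depth, pose, K):
    """tgt/src: (B,C,H,W) images, depth: (B,1,H,W), pose: (B,3,4) [R|t] mapping tgt to src, K: (B,3,3)."""
    b, _, h, w = tgt.shape
    # Pixel grid in homogeneous coordinates.
    ys = torch.arange(h, dtype=tgt.dtype, device=tgt.device).view(h, 1).expand(h, w)
    xs = torch.arange(w, dtype=tgt.dtype, device=tgt.device).view(1, w).expand(h, w)
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(1, 3, -1)   # (1,3,HW)
    # Back-project target pixels to 3D, transform into the source frame, and re-project.
    cam = torch.inverse(K) @ pix * depth.view(b, 1, -1)
    cam_src = pose[:, :, :3] @ cam + pose[:, :, 3:]
    pix_src = K @ cam_src
    pix_src = pix_src[:, :2] / pix_src[:, 2:].clamp(min=1e-6)
    # Normalize to [-1, 1] for grid_sample and warp the source image into the target view.
    grid = torch.stack([2 * pix_src[:, 0] / (w - 1) - 1,
                        2 * pix_src[:, 1] / (h - 1) - 1], dim=-1).view(b, h, w, 2)
    warped = F.grid_sample(src, grid, padding_mode='border', align_corners=True)
    return (tgt - warped).abs().mean()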

Evaluation

You can evaluate depth and pose by running

bash scripts/test_vivid_indoor.sh
bash scripts/test_vivid_outdoor.sh

and visualize depth by running

bash scripts/run_vivid_inference.sh

You can see a comprehensive summary of the overall results by running

bash scripts/display_result.sh
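The inference script saves predicted depth maps. If you want to colorize a saved depth map yourself, a simple sketch is given below; the filename and colormap choice are assumptions:

import numpy as np
import matplotlib.cm as cm
import imageio

def colorize_depth(depth, cmap='magma'):
    """Map an (H, W) depth array to an (H, W, 3) uint8 color image."""
    inv = 1.0 / np.maximum(depth, 1e-6)                      # inverse depth often shows better contrast
    norm = (inv - inv.min()) / max(inv.max() - inv.min(), 1e-6)
    return (cm.get_cmap(cmap)(norm)[..., :3] * 255).astype(np.uint8)

# Example usage with a hypothetical output file:
# depth = np.load('results/000001_depth.npy')
# imageio.imwrite('results/000001_depth_color.png', colorize_depth(depth))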

Citation

Please cite the following paper if you use our work, parts of this code, or the pre-processed dataset in your research.

@ARTICLE{shin2022maximize,  
	author={Shin, Ukcheol and Lee, Kyunghyun and Lee, Byeong-Uk and Kweon, In So},  
	journal={IEEE Robotics and Automation Letters},   
	title={Maximizing Self-Supervision From Thermal Image for Effective Self-Supervised Learning of Depth and Ego-Motion},   
	year={2022},  
	volume={7},  
	number={3},  
	pages={7771-7778},  
	doi={10.1109/LRA.2022.3185382}
}	

Related projects


thermalmonodepth's Issues

code release

Great job!
When will the code be released?
Thank you!

About dataset structure

The dataset structure you mentioned in the README contains some files that are not available in the ViViD++ rosbags, for example cali_ther_to_rgb.yaml, avg_velocity_thermal.txt, and poses_thermal.txt. How can I get these files?

About Evaluation

Hello

I found this research very interesting.

I am running the evaluation code and checking the performance, but I am not sure which result should be compared against the reported numbers.

The reported performance on the Indoor test set (Well-lit) is as follows.

Ours | 0.152 | 0.121 | 0.538 | 0.196 | 0.814 | 0.965 | 0.992

However, when I measure the performance using the uploaded pre-trained model, I get the following result for the "indoor_robust_varying_well_lit" folder.

==> Evaluating depth result...
 Scaling ratios | med: 4.776 | std: 0.056
 Scaling ratios | mean: 4.764 +- std: 0.268

   abs_rel |   sq_rel |     rmse | rmse_log |       a1 |       a2 |       a3 | 
&   0.129  &   0.090  &   0.445  &   0.171  &   0.879  &   0.981  &   0.996  \\

Is the performance of the folder above the right one to compare against the reported numbers?
And which folder corresponds to the "Indoor test set (Low-/Zero-light)" results?

Is this a bug?

in common/data_prepare/prepare_train_data_VIVID.py
[Screenshot from 2022-07-24 16-17-55]

For the 'indoor_robust_varying_well_lit' sequence, why are the 'RGB' folder contents and the 'Thermal' folder contents exchanged? Is this a bug?

About hyperparameter settings

Hello
Your work is very inspiring. Thank you for sharing!

I found that the learning rate is set to 1e-6 in the paper, but the default parameter of the provided code is set to 1e-4. Similarly, the weight of the smoothing loss is set to 0.1 in the paper, but the default parameter of the provided code is set to 0.01.

Which parameter should be used for better training results?

Dataset preparation

Hi,

I noticed that in common/data_prepare/VIVID_raw_loader.py the validation set is defined as a subset of the test set:

self.indoor_train_list = ['indoor_aggresive_global', 'indoor_unstable_local', 'indoor_robust_global', 'indoor_robust_local', 'indoor_unstable_global']
self.indoor_val_list = ['indoor_robust_dark', 'indoor_aggresive_local']
self.indoor_test_list = ['indoor_robust_dark', 'indoor_robust_varying', 'indoor_aggresive_dark', 'indoor_unstable_dark', 'indoor_aggresive_local']
self.outdoor_train_list = ['outdoor_robust_day1', 'outdoor_robust_day2']
self.outdoor_val_list = ['outdoor_robust_night1']
self.outdoor_test_list = ['outdoor_robust_night1', 'outdoor_robust_night2']

Could you please tell me the reason?
