Real-Time and High-Accuracy Switchable Stereo Depth Estimation Method Utilizing Self-Supervised Online Learning Mechanism for MIS.
Jieyu Zheng, Xiaojian Li*, Xin Wang*, Haojun Wu, Ling Li, Xiang Ma, Shanlin Yang
To pretrain the network SSDNet, you will need to download the required datasets.
- Sceneflow (DispNet/FlowNet2.0 dataset subsets are enough)
The downloaded dataset should have the following structure:
├── SceneFlowSubData
    ├── disparity_occlusions
        ├── train
        ├── val
    ├── frames_cleanpass
        ├── train
        ├── val
    ├── frames_disparity
        ├── train
        ├── val
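Before starting a long pretraining run, it can help to confirm that the download matches this layout. The following is a minimal sketch, not part of the MIOL code base: the helper name `missing_sceneflow_dirs` is hypothetical, and it only checks that the folders listed above exist, not their contents.

```python
from pathlib import Path

# Folder names taken from the layout listed above.
EXPECTED_SUBSETS = ("disparity_occlusions", "frames_cleanpass", "frames_disparity")
EXPECTED_SPLITS = ("train", "val")


def missing_sceneflow_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    missing = []
    for subset in EXPECTED_SUBSETS:
        for split in EXPECTED_SPLITS:
            candidate = root / subset / split
            if not candidate.is_dir():
                # Report paths relative to the dataset root, with forward slashes.
                missing.append(candidate.relative_to(root).as_posix())
    return missing
```

Calling `missing_sceneflow_dirs("path_to_SceneFlowSubData")` returns an empty list when all six folders are present, and otherwise lists the ones that still need to be downloaded or moved.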
To validate the MIOL framework, you will need to create your own test data, structured as follows:
├── data
    ├── test_02
        ├── 00000.jpg
        ├── 00001.jpg
        ├── ...
        ├── 00600.jpg
        ├── calib.yaml
    ├── test_03
        ├── 00000.jpg
        ├── 00001.jpg
        ├── ...
        ├── 00600.jpg
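Since the frames are named by zero-padded index, a quick check for gaps in a sequence can catch extraction mistakes early. The sketch below is illustrative only: `list_frames` and `missing_indices` are hypothetical helpers, not functions from this repository, and they assume that every frame file has a purely numeric stem and a `.jpg` extension, as in the listing above.

```python
from pathlib import Path


def list_frames(sequence_dir, suffix=".jpg"):
    """Return frame paths in `sequence_dir`, sorted by their numeric file stem."""
    frames = [
        p for p in Path(sequence_dir).iterdir()
        if p.suffix.lower() == suffix and p.stem.isdigit()
    ]
    return sorted(frames, key=lambda p: int(p.stem))


def missing_indices(frames):
    """Return frame numbers absent between the first and last frame found."""
    present = {int(p.stem) for p in frames}
    if not present:
        return []
    return [i for i in range(min(present), max(present) + 1) if i not in present]
```

For example, `missing_indices(list_frames("data/test_02"))` returns an empty list for a complete sequence. Non-image files such as `calib.yaml` are ignored by the suffix filter.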
We recommend using Anaconda to set up the environment:
cd MIOL
conda create -n pytorch python=3.7
conda activate pytorch
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt
Adjust the torch version to match your GPU and CUDA installation. We have tested the code on:
- Ubuntu 18.04/20.04 with Python 3.7 and CUDA 11.1.
- Windows 10/11 with Python 3.7 and CUDA 11.7.
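To see which torch build ended up in the environment and whether it can reach the GPU, a short check like the one below may be useful. It is a sketch rather than part of the project: `describe_torch_env` is a hypothetical helper, and it relies only on standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.get_device_name()`).

```python
import importlib.util


def describe_torch_env():
    """Collect the installed torch version and CUDA details, if torch is present."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    info = {
        "installed": True,
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


print(describe_torch_env())
```

If `cuda_available` is `False` on a machine with an NVIDIA GPU, the installed wheel most likely does not match the local driver or CUDA version, and a different `--index-url` may be needed.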
Pretrained models can be downloaded from OneDrive.
Our model was trained on a single RTX 3090 GPU using the following command. Training logs are written to checkpoints/log_name and can be visualized with TensorBoard.
python train_STM_sceneflow_meta.py --meta-batch-size 4 -k 1 -q 1 --inner-lr 1e-4 --meta-lr 1e-4 --epochs 25 --data path_to_SceneFlowSubData/ --name log_name
You can demo the trained model on a sequence of stereo images. To predict depth for your own dataset, run:
python test_inference.py --pretrained-model sceneflow_pretrained.tar --calib-path path_to_calib_yaml --dataset-dir path_to_test_02 --output-dir output --output-depth
To save the depth values as .npy files and the depth maps as .png images, run with the --output-depth flag.
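For readers unfamiliar with how stereo calibration relates to depth: for a rectified stereo pair, depth follows from disparity as Z = f · B / d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels. The sketch below illustrates that standard relation with NumPy. It is not taken from `test_inference.py`, whose internals and calib.yaml format are not described here; the function name, parameter names, and numeric values are assumptions for illustration.

```python
import numpy as np


def disparity_to_depth(disparity, focal_px, baseline, eps=1e-6):
    """Convert a disparity map (pixels) to depth in the units of `baseline`.

    Pixels with disparity <= eps are treated as invalid and set to 0.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.zeros_like(disparity)
    valid = disparity > eps
    depth[valid] = focal_px * baseline / disparity[valid]
    return depth


# Hypothetical numbers for illustration only; real values come from the calibration.
disp = np.array([[10.0, 20.0], [0.0, 40.0]])
depth = disparity_to_depth(disp, focal_px=400.0, baseline=5.0)
print(depth)
```

A saved depth array can be read back with `np.load`. The file names written to the output directory are not specified in this README, so check the output folder after a run.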
Here we show some results of the proposed MIOL framework on the Hamlyn and SCARED datasets.
1.mp4
Our code is based on the excellent works of SC-SfMLearner and monodepth2.
For academic use, the code is released under the permissive MIT license. We share this project for research and personal use. For any commercial use, please contact the authors.
If you find this code useful for your research, please use the following BibTeX entries:
@article{zheng2024MIOL,
author={Zheng, Jieyu and Li, Xiaojian and Wang, Xin and Wu, Haojun and Li, Ling and Ma, Xiang and Yang, Shanlin},
journal={IEEE Transactions on Instrumentation and Measurement},
title={Real-Time and High-Accuracy Switchable Stereo Depth Estimation Method Utilizing Self-Supervised Online Learning Mechanism for MIS},
year={2024},
volume={73},
pages={1-13}
}