Official Code Implementation for MIOL in 3D laparoscopy

Real-Time and High-Accuracy Switchable Stereo Depth Estimation Method Utilizing Self-Supervised Online Learning Mechanism for MIS.
Jieyu Zheng, Xiaojian Li*, Xin Wang*, Haojun Wu, Ling Li, Xiang Ma, Shanlin Yang

Github Repository Paper

💾 Data Description

To pretrain the network SSDNet, you will need to download the required datasets.

  • Sceneflow (DispNet/FlowNet2.0 dataset subsets are enough)

The structure of the downloaded dataset is as follows

├── SceneFlowSubData
    ├── disparity_occlusions
        ├── train
        ├── val
    ├── frames_cleanpass
        ├── train
        ├── val
    ├── frames_disparity
        ├── train
        ├── val

To validate the MIOL framework, you will need to create your own test data structured as follows

├── data
    ├── test_02
        ├── 00000.jpg
        ├── 00001.jpg
        ├── ...
        ├── 00600.jpg
        ├── calib.yaml
    ├── test_03
        ├── 00000.jpg
        ├── 00001.jpg
        ├── ...
        ├── 00600.jpg
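
Before running inference, it can help to confirm that a sequence folder matches the layout above. The following is a minimal sketch, not part of the repository: the helper name check_sequence is hypothetical, and it assumes frames are named with zero-padded integers starting at 00000 and that the calibration file is called calib.yaml, as in the tree. It does not inspect the image contents.

```python
from pathlib import Path


def check_sequence(seq_dir):
    """Summarise one test sequence folder laid out like data/test_02 above.

    Hypothetical helper: only file names are checked, nothing is decoded.
    """
    seq_dir = Path(seq_dir)
    # Assumption: frames are named 00000.jpg, 00001.jpg, ... with no gaps.
    indices = sorted(
        int(p.stem) for p in seq_dir.glob("*.jpg") if p.stem.isdecimal()
    )
    expected = set(range(indices[-1] + 1)) if indices else set()
    return {
        "num_frames": len(indices),
        "missing": sorted(expected - set(indices)),
        "has_calib": (seq_dir / "calib.yaml").is_file(),
    }


if __name__ == "__main__":
    # Example: report on every sequence folder under ./data
    for folder in sorted(Path("data").glob("test_*")):
        print(folder.name, check_sequence(folder))
```

A non-empty "missing" list or a false "has_calib" points to a folder that differs from the structure shown above.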

โ˜€๏ธ Setup

We recommend using Anaconda to set up an environment

cd MIOL
conda create -n pytorch python=3.7
conda activate pytorch
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu117
pip install -r requirements.txt

Adjust the torch build to match your GPU and CUDA version (the command above installs the CUDA 11.7 wheels). We have tested our code on:

  • Ubuntu 18.04/20.04 with Python 3.7 and CUDA 11.1.
  • Windows 10/11 with Python 3.7 and CUDA 11.7.
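
To compare your environment with the tested configurations, a short script along these lines can print the relevant versions. This is a sketch rather than a tool shipped with the code; the function name describe_environment is hypothetical, and it reports only what the installed PyTorch build exposes.

```python
import platform


def describe_environment():
    """Collect OS, Python, PyTorch and CUDA details for a quick sanity check."""
    info = {"os": platform.system(), "python": platform.python_version()}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in the active environment.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If cuda_available is reported as False on a machine with a GPU, the installed wheel most likely does not match the local driver or CUDA version.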

🚴 Training

Pretrained models can be downloaded from OneDrive.

Our model is trained on one RTX-3090 GPU using the following command. Training logs are written to checkpoints/log_name and can be visualized with TensorBoard.

python train_STM_sceneflow_meta.py --meta-batch-size 4 -k 1 -q 1 --inner-lr 1e-4 --meta-lr 1e-4 --epochs 25 --data path_to_SceneFlowSubData/ --name log_name

🎥 Demos

You can demo the trained model on a sequence of stereo images. To predict depth for your dataset, run

python test_inference.py --pretrained-model sceneflow_pretrained.tar --calib-path path_to_calib_yaml --dataset-dir path_to_test_02 --output-dir output --output-depth

The --output-depth flag saves the depth values as .npy files and the depth maps as .png images.
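
How test_inference.py uses the calibration file is not described here. As background only, for a rectified stereo pair depth is conventionally recovered from disparity as Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels. The sketch below illustrates that relation in plain Python; the function and parameter names (disparity_to_depth, focal_px, baseline_mm, min_disp) are hypothetical and are not taken from the repository.

```python
def disparity_to_depth(disparity, focal_px, baseline_mm, min_disp=1e-6):
    """Convert a 2D disparity map (pixels) to depth (mm) via Z = f * B / d.

    Assumes a rectified stereo pair. Pixels whose disparity is not above
    min_disp are marked invalid with a depth of 0.0.
    """
    depth = []
    for row in disparity:
        depth.append([
            focal_px * baseline_mm / d if d > min_disp else 0.0
            for d in row
        ])
    return depth


if __name__ == "__main__":
    # Illustrative values only: f = 1000 px, B = 4 mm.
    disp = [[10.0, 0.0], [20.0, 5.0]]
    print(disparity_to_depth(disp, focal_px=1000.0, baseline_mm=4.0))
```

Because depth is inversely proportional to disparity, small disparity errors on distant surfaces translate into large depth errors, which is why invalid or near-zero disparities are masked out in this sketch.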

๐Ÿฌ Visualization

Here we show some results of the proposed MIOL framework on Hamlyn and SCARED datasets.

  • Depth map predictions

  • Point Clouds

🌹 Acknowledgment

Our code is based on the excellent works of SC-SfMLearner and monodepth2.

License

For academic use, the code is released under the permissive MIT license. Our intention in sharing the project is to support research and personal use. For any commercial use, please contact the authors.

Citation

If you find this code useful for your research, please use the following BibTeX entries:

@article{zheng2024MIOL,
  author={Zheng, Jieyu and Li, Xiaojian and Wang, Xin and Wu, Haojun and Li, Ling and Ma, Xiang and Yang, Shanlin},
  journal={IEEE Transactions on Instrumentation and Measurement}, 
  title={Real-Time and High-Accuracy Switchable Stereo Depth Estimation Method Utilizing Self-Supervised Online Learning Mechanism for MIS}, 
  year={2024},
  volume={73},
  pages={1-13}
}
