Giter Site home page Giter Site logo

surrounddepth's Introduction

SurroundDepth

Project Page | Paper | Data


SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation
Yi Wei*, Linqing Zhao*, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie Zhou

Introduction

Depth estimation from images serves as the fundamental step of 3D perception for autonomous driving and is an economical alternative to expensive depth sensors like LiDAR. The temporal photometric consistency enables self-supervised depth estimation without labels, further facilitating its application. However, most existing methods predict the depth solely based on each monocular image and ignore the correlations among multiple surrounding cameras, which are typically available for modern self-driving vehicles. In this paper, we propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.

Model Zoo

type dataset Abs Rel Sq Rel delta < 1.25 download
scale-ambiguous DDAD 0.200 3.392 0.740 model
scale-aware DDAD 0.208 3.371 0.693 model
scale-ambiguous nuScenes 0.245 3.067 0.719 model
scale-aware nuScenes 0.280 4.401 0.661 model

Install

  • python 3.8, pytorch 1.8.1, CUDA 11.4, RTX 3090
git clone https://github.com/weiyithu/SurroundDepth.git
conda create -n surrounddepth python=3.8
conda activate surrounddepth
pip install -r requirements.txt

Since we use dgp codebase to generate groundtruth depth, you should also install it.

Data Preparation

Datasets are assumed to be downloaded under data/<dataset-name>.

DDAD

  • Please download the official DDAD dataset and place them under data/ddad/raw_data. You may refer to official DDAD repository for more info and instructions.
  • Please download metadata of DDAD and place these pkl files in datasets/ddad.
  • We provide annotated self-occlusion masks for each sequences. Please download masks and place them in data/ddad/mask.
  • Export depth maps for evaluation
cd tools
python export_gt_depth_ddad.py val
  • Generate scale-aware SfM pseudo labels for pretraining (it may take several hours). Note that since we use SIFT descriptor in cv2, you need to change cv2 version and we suggest start a new conda environment.
conda create -n sift python=3.6
conda activate sift
pip install opencv-python==3.4.2.16 opencv-contrib-python==3.4.2.16
python sift_ddad.py
python match_ddad.py
  • The final data structure should be:
SurroundDepth
├── data
│   ├── ddad
│   │   │── raw_data
│   │   │   │── 000000
|   |   |   |── ...
|   |   |── depth
│   │   │   │── 000000
|   |   |   |── ...
|   |   |── match
│   │   │   │── 000000
|   |   |   |── ...
|   |   |── mask
│   │   │   │── 000000
|   |   |   |── ...

nuScenes

  • Please download the official nuScenes dataset to data/nuscenes/raw_data
  • Export depth maps for evaluation
cd tools
python export_gt_depth_nusc.py val
  • Generate scale-aware SfM pseudo labels for pretraining (it may take several hours).
conda activate sift
python sift_nusc.py
python match_nusc.py
  • The final data structure should be:
SurroundDepth
├── data
│   ├── nuscenes
│   │   │── raw_data
│   │   │   │── samples
|   |   |   |── sweeps
|   |   |   |── maps
|   |   |   |── v1.0-trainval
|   |   |── depth
│   │   │   │── samples
|   |   |── match
│   │   │   │── samples

Training

Take DDAD dataset as an example. Train scale-ambiguous model.

python -m torch.distributed.launch --nproc_per_node 8 --num_workers=8 run.py  --model_name ddad  --config configs/ddad.txt 

Train scale-aware model. First we should conduct SfM pretraining.

python -m torch.distributed.launch --nproc_per_node 8  run.py  --model_name ddad_scale_pretrain  --config configs/ddad_scale_pretrain.txt 

Then we select the best pretrained model.

python -m torch.distributed.launch --nproc_per_node 8  run.py  --model_name ddad_scale  --config configs/ddad_scale.txt  --load_weights_folder=${best pretrained}

We observe that the training on nuScenes dataset is unstable and easy to overfit. Also, the results with 4 GPUs are much better than 8 GPUs. Thus, we set fewer epochs and use 4 GPUs for nuScenes experiments. We also provide SfM pretrained model on DDAD and nuScenes.

Evaluation

python -m torch.distributed.launch --nproc_per_node ${NUM_GPU}  run.py  --model_name test  --config configs/${TYPE}.txt --models_to_load depth encoder   --load_weights_folder=${PATH}  --eval_only 

Acknowledgement

Our code is based on Monodepth2.

Citation

If you find this project useful in your research, please consider cite:

@article{wei2022surround,
    title={SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation},
    author={Wei, Yi and Zhao, Linqing and Zheng, Wenzhao and Zhu, Zheng and Rao, Yongming and Huang ,Guan and Lu, Jiwen and Zhou, Jie},
    journal={arXiv preprint arXiv:2022.6980},
    year={2022}
}

surrounddepth's People

Contributors

weiyithu avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.