
Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes

Official PyTorch implementation for the NeurIPS 2023 paper: "Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes".

Home Page: https://dynamo-depth.github.io

License: MIT

Table of Contents

  • Installation
  • Quick Demo
  • Data Preparation
  • Training
  • Evaluation
  • Citation

Installation

The code is tested with python=3.7, torch==1.12.1+cu102 and torchvision==0.13.1+cu102 on four RTX 2080 Ti GPUs.

git clone --recurse-submodules https://github.com/YihongSun/Dynamo-Depth/
cd Dynamo-Depth/
conda create -n dynamo python=3.7
conda activate dynamo
pip install torch==1.12.1 torchvision==0.13.1
pip install matplotlib wandb opencv-python tqdm gdown scikit-image timm==0.6.13
pip install imageio==2.19.3
pip install imageio-ffmpeg==0.4.7

Quick Demo

Please run quick-demo.ipynb for a quick example of inference (this can be run directly after installation).
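The notebook handles model construction and checkpoint loading; as a rough, hypothetical sketch of what single-image depth inference looks like in PyTorch (the module loading, paths, and output format below are assumptions for illustration, not the repo's actual API):

# Hypothetical sketch of single-image depth inference; see quick-demo.ipynb for the actual code.
import torch
from PIL import Image
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed: a depth network saved as a full torch module (placeholder path);
# the real notebook builds its networks and loads state dicts explicitly.
depth_net = torch.load("ckpt/depth_net.pth", map_location=device)
depth_net.eval()

img = Image.open("example.png").convert("RGB")          # placeholder input image
x = transforms.ToTensor()(img).unsqueeze(0).to(device)  # 1 x 3 x H x W tensor in [0, 1]

with torch.no_grad():
    disp = depth_net(x)  # assumed to return an inverse-depth (disparity) map
print(disp.shape)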

Data Preparation

Please refer to the preprocessing instructions for preparing training data for the KITTI, Waymo Open, or nuScenes datasets.

Training

Training can be done with a single GPU or multiple GPUs (via torch.nn.parallel.DistributedDataParallel).

The following shared arguments can be used with any training method (see the example command after this list).

  • -n <EXP_NAME> indicates the name of the experiment.
  • -d <DATASET_NAME> specifies which dataset ("waymo", "nuscenes", or "kitti") to train on, with default "waymo".
  • -l </PATH/TO/MODEL/CKPT> indicates which model checkpoint to load before training.
  • --depth_model <MODEL_NAME> specifies which depth model ("litemono" or "monodepthv2") to train, with default "litemono".
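
For example, a template command that combines these flags to train a Monodepth2-based model on KITTI starting from an existing checkpoint (the experiment name and checkpoint path are placeholders):

python3 train.py -d kitti -n kitti_example_run --depth_model monodepthv2 -l /PATH/TO/MODEL/CKPT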

⏳ Single GPU Training

For instance, to train with 1 GPU on the Waymo Open Dataset from scratch:

python3 train.py -d waymo -n waymo_example_run 

⏳ Multi-GPU Training

For instance, to train with 4 GPUs on the Waymo Open Dataset:

python -m torch.distributed.launch --nproc_per_node=4 train.py --cuda_ids 0 1 2 3 -d waymo -n waymo_example_run_parallel 

Note: All experiments were run with 4 RTX 2080 Ti GPUs via torch.nn.parallel.DistributedDataParallel. The learning rate and scheduler step size should be adjusted accordingly when training with a single GPU (see options.py for details).

Evaluation

Scripts for evaluation are found in eval/, including depth, motion segmentation, odometry, and visualization.

The following shared arguments can be used with any of the evaluation scripts above (see the example command after this list).

  • -l </PATH/TO/MODEL/CKPT> indicates which model checkpoint to evaluate.
  • --depth_model <MODEL_NAME> specifies which depth model ("litemono" or "monodepthv2") to use, with default "litemono".
  • -d <DATASET_NAME> specifies which dataset ("waymo", "nuscenes", or "kitti") to evaluate on, with default "waymo".
  • --eval_dir defines the output directory where results will be saved, with default "./outputs".
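
For example, a template command that combines these flags to evaluate a Monodepth2-based checkpoint on KITTI and write results to the default output directory (the checkpoint path is a placeholder):

python3 eval/depth.py -l /PATH/TO/MODEL/CKPT --depth_model monodepthv2 -d kitti --eval_dir ./outputs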

Note: To access the trained models for Waymo Open, please send us an email with your name, institute, a screenshot of the Waymo dataset registration confirmation mail, and your intended usage. Please send a second email if we do not get back to you within two days. Note that the Waymo Open Dataset is under a strict non-commercial license, so we are not allowed to share the models if they will be used for any profit-oriented activities.

📊 Depth

eval/depth.py evaluates monocular depth estimation, with results saved in ./outputs/<CKPT>_<DATASET>/depth/.

🔹 To replicate the results reported in the paper (Tables 1 and 2), run the following lines.

## === Missing checkpoints will be downloaded automatically === ##

python3 eval/depth.py -l ckpt/W_Dynamo-Depth                                  ## please reach out for ckpt!!
python3 eval/depth.py -l ckpt/W_Dynamo-Depth_MD2 --depth_model monodepthv2    ## please reach out for ckpt!!
python3 eval/depth.py -l ckpt/N_Dynamo-Depth -d nuscenes
python3 eval/depth.py -l ckpt/N_Dynamo-Depth_MD2 --depth_model monodepthv2 -d nuscenes
python3 eval/depth.py -l ckpt/K_Dynamo-Depth -d kitti
python3 eval/depth.py -l ckpt/K_Dynamo-Depth_MD2 --depth_model monodepthv2 -d kitti

| Model | Dataset | Abs Rel | Sq Rel | RMSE | RMSE log | δ < 1.25 | δ < 1.25² | δ < 1.25³ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| K_Dynamo-Depth_MD2 | KITTI | 0.120 | 0.864 | 4.850 | 0.195 | 0.858 | 0.956 | 0.982 |
| K_Dynamo-Depth(*) | KITTI | 0.112 | 0.768 | 4.528 | 0.184 | 0.874 | 0.961 | 0.984 |
| N_Dynamo-Depth_MD2 | nuScenes | 0.193 | 2.285 | 7.357 | 0.287 | 0.765 | 0.885 | 0.935 |
| N_Dynamo-Depth | nuScenes | 0.179 | 2.118 | 7.050 | 0.271 | 0.787 | 0.896 | 0.940 |
| W_Dynamo-Depth_MD2(†) | Waymo | 0.130 | 1.439 | 6.646 | 0.183 | 0.851 | 0.959 | 0.985 |
| W_Dynamo-Depth(†) | Waymo | 0.116 | 1.156 | 6.000 | 0.166 | 0.878 | 0.969 | 0.989 |

(*) Very minor differences compared to the results in the paper; the remaining checkpoints are consistent with the paper.
(†) Please refer to the note above for obtaining access to the models trained on Waymo Open Dataset.

🔹 To replicate the results reported in the Appendix (Tables 6 and 7), run the following lines.

## === Missing checkpoints will be downloaded automatically === ##

python3 eval/depth.py -l ckpt/N_Dynamo-Depth -d nuscenes --split nuscenes_dayclear
python3 eval/depth.py -l ckpt/N_Dynamo-Depth_MD2 --depth_model monodepthv2 -d nuscenes --split nuscenes_dayclear

Note that by adding --split nuscenes_dayclear, we evaluate on the nuScenes day-clear subset as defined in splits/nuscenes_dayclear/test_files.txt instead of the original splits/nuscenes/test_files.txt.

📊 Motion Segmentation

eval/motion_segmentation.py evaluates binary motion segmentation, with results saved in ./outputs/<CKPT>_<DATASET>/mot_seg/.

🔹 To replicate the results reported in the paper (Figures 4 and 8), run the following lines.

## === Missing checkpoints will be downloaded automatically === ##

python3 eval/motion_segmentation.py -l ckpt/W_Dynamo-Depth                         ## please reach out for ckpt!!
python3 eval/motion_segmentation.py -l ckpt/N_Dynamo-Depth -d nuscenes --split nuscenes_dayclear

📊 Odometry

eval/odometry.py evaluates odometry, with results saved in ./outputs/<CKPT>_<DATASET>/odometry/.

🔹 To replicate the results reported in the Appendix (Table 8), run the following lines.

## === Missing checkpoints will be downloaded automatically === ##

python3 eval/odometry.py -l ckpt/W_Dynamo-Depth                                    ## please reach out for ckpt!!                                  
python3 eval/odometry.py -l ckpt/W_Dynamo-Depth_MD2 --depth_model monodepthv2      ## please reach out for ckpt!!     
python3 eval/odometry.py -l ckpt/N_Dynamo-Depth -d nuscenes --split nuscenes_dayclear
python3 eval/odometry.py -l ckpt/N_Dynamo-Depth_MD2 --depth_model monodepthv2 -d nuscenes --split nuscenes_dayclear

🖼️ Visualization

eval/visualize.py visualizes model predictions, with results saved in ./outputs/<CKPT>_<DATASET>/vis/.

🔹 To generate the qualitative results shown on the project page, run the following lines.

## === Missing checkpoints will be downloaded automatically === ##

python3 eval/visualize.py -l ckpt/W_Dynamo-Depth                                   ## please reach out for ckpt!!     
python3 eval/visualize.py -l ckpt/N_Dynamo-Depth -d nuscenes

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{sun2023dynamodepth,
  title={Dynamo-Depth: Fixing Unsupervised Depth Estimation for Dynamical Scenes},
  author={Yihong Sun and Bharath Hariharan},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
