Giter Site home page Giter Site logo

manydepth's Introduction

Fork of manydepth for Personal purpose

conda env create -f environment.yml

If you meet the error below

NVIDIA GeForce RTX 3060 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.

run the following command

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu114/torch_nightly.html -U

[Link to paper]

๐Ÿ‘€ Reproducing Paper Results

To recreate the results from our paper, run:

CUDA_VISIBLE_DEVICES=<your_desired_GPU> \
python -m manydepth.train \
    --data_path <your_KITTI_path> \
    --log_dir <your_save_path>  \
    --model_name <your_model_name>

Depending on the size of your GPU, you may need to set --batch_size to be lower than 12. Additionally you can train a high resolution model by adding --height 320 --width 1024.

For instructions on downloading the KITTI dataset, see Monodepth2

To train a CityScapes model, run:

CUDA_VISIBLE_DEVICES=<your_desired_GPU> \
python -m manydepth.train \
    --data_path <your_preprocessed_cityscapes_path> \
    --log_dir <your_save_path>  \
    --model_name <your_model_name> \
    --dataset cityscapes_preprocessed \
    --split cityscapes_preprocessed \
    --freeze_teacher_epoch 5 \
    --height 192 --width 512

Note here the --freeze_teacher_epoch 5 command - we found this to be important for Cityscapes models, due to the large number of images in the training set.

This assumes you have already preprocessed the CityScapes dataset using SfMLearner's prepare_train_data.py script. We used the following command:

python prepare_train_data.py \
    --img_height 512 \
    --img_width 1024 \
    --dataset_dir <path_to_downloaded_cityscapes_data> \
    --dataset_name cityscapes \
    --dump_root <your_preprocessed_cityscapes_path> \
    --seq_length 3 \
    --num_threads 8

Note that while we use the --img_height 512 flag, the prepare_train_data.py script will save images which are 1024x384 as it also crops off the bottom portion of the image. You could probably save disk space without a loss of accuracy by preprocessing with --img_height 256 --img_width 512 (to create 512x192 images), but this isn't what we did for our experiments.

๐Ÿ’พ Pretrained weights and evaluation

You can download weights for some pretrained models here:

To evaluate a model on KITTI, run:

CUDA_VISIBLE_DEVICES=<your_desired_GPU> \
python -m manydepth.evaluate_depth \
    --data_path <your_KITTI_path> \
    --load_weights_folder <your_model_path>
    --eval_mono

Make sure you have first run export_gt_depth.py to extract ground truth files.

And to evaluate a model on Cityscapes, run:

CUDA_VISIBLE_DEVICES=<your_desired_GPU> \
python -m manydepth.evaluate_depth \
    --data_path <your_cityscapes_path> \
    --load_weights_folder <your_model_path>
    --eval_mono \
    --eval_split cityscapes

During evaluation, we crop and evaluate on the middle 50% of the images.

We provide ground truth depth files HERE, which were converted from pixel disparities using intrinsics and the known baseline. Download this and unzip into splits/cityscapes.

If you want to evaluate a teacher network (i.e. the monocular network used for consistency loss), then add the flag --eval_teacher. This will load the weights of mono_encoder.pth and mono_depth.pth, which are provided for our KITTI models.

๐Ÿ–ผ Running on your own images

We provide some sample code in test_simple.py which demonstrates multi-frame inference. This predicts depth for a sequence of two images cropped from a dashcam video. Prediction also requires an estimate of the intrinsics matrix, in json format. For the provided test images, we have estimated the intrinsics to be equivalent to those of the KITTI dataset. Note that the intrinsics provided in the json file are expected to be in normalised coordinates.

Download and unzip model weights from one of the links above, and then run the following command:

python -m manydepth.test_simple \
    --target_image_path assets/test_sequence_target.jpg \
    --source_image_path assets/test_sequence_source.jpg \
    --intrinsics_json_path assets/test_sequence_intrinsics.json \
    --model_path path/to/weights

A predicted depth map rendering will be saved to assets/test_sequence_target_disp.jpeg.

๐Ÿ‘ฉโ€โš–๏ธ License

Copyright ยฉ Niantic, Inc. 2021. Patent Pending. All rights reserved. Please see the license file for terms.

manydepth's People

Contributors

jamiewatson683 avatar mdfirman avatar daniyar-niantic avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.