tcmonodepth's Introduction

TCMonoDepth: Enforcing Temporal Consistency in Video Depth Estimation

TCMonoDepth is a method (and pretrained model) for estimating stable, temporally consistent depth for any video.

Paper

Usage

Requirements

  • python
  • pytorch
  • torchvision
  • opencv
  • tqdm

Testing

You can download our pretrained checkpoint from link (Google Drive) or link (Baidu Cloud, extraction code: w2kr) and save it in the ./weights folder. Put your videos into the videos folder and run

cd TCMonoDepth
python demo.py --model large --resume ./weights/_ckpt.pt.tar --input ./videos --output ./output --resize_size 384

A small MonoDepth model for mobile devices

A lightweight and very fast monodepth model

cd TCMonoDepth
python demo.py --model small --resume ./weights/_ckpt_small.pt.tar --input ./videos --output ./output --resize_size 256

Bibtex

If you use this code for your research, please consider starring this repo and citing our paper.

@inproceedings{li2021enforcing,
 title={Enforcing Temporal Consistency in Video Depth Estimation},
 author={Li, Siyuan and Luo, Yue and Zhu, Ye and Zhao, Xun and Li, Yu and Shan, Ying},
 booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops},
 year={2021}
}

Acknowledgement

In this project, parts of the code are adapted from MiDaS. We thank the authors for sharing the code of their great work.

tcmonodepth's Issues

Request for Implementation details

Could you please elaborate on what exactly your SILoss is? Your reference, MiDaS, has several SILoss variants. Which one exactly do you use?

Could you please also give other implementation details, including the hyperparameters: the value of σ in your soft occlusion mask, λ (the weight of the temporal consistency term), δ (the threshold for color differences), and so on?

I would also be interested to know whether you normalize or scale your depth predictions by any scaling factor at training or evaluation time.

Thank you

Temporal filtering

Hi, is the temporal filtering done on the pretrained model itself or in demo.py? I thought Midasnet is not temporally consistent. Would you mind pointing out how you did the temporal filtering?

Metric of Temporal Consistency

Hi, thank you for sharing. Great work. I have read your paper 'Enforcing Temporal Consistency in Video Depth Estimation' and have a question: what is an appropriate value for the threshold 'thr' defined in formula (2) of your paper? Could you give me some hints? Thanks!

Assertion Error in Colab

I am not sure if this is currently a torch issue, but would anybody happen to know how to fix this?

/content/TCMonoDepth
Run Video Depth Sample
Initialize
Device: cuda
Creating model...
model size is 0.5x
Loading model from /content/TCMonoDepth/weights/_ckpt_small.pt.tar
Loading model done...
<VideoCapture 0x7f581ca43fd0>
Error opening video stream or file
Traceback (most recent call last):
  File "processVideoFile.py", line 187, in <module>
    run(args)
  File "processVideoFile.py", line 155, in run
    write_video(outputfile, color_list, fps)
  File "processVideoFile.py", line 22, in write_video
    assert (len(output_list) > 0)
AssertionError
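
The log above prints "Error opening video stream or file" before the assertion fails, which suggests that no frames were read and the output list stayed empty. A minimal sketch of a pre-flight check that fails with a clearer message is shown below. It is not part of the repository: the function name `find_videos` and the extension list are hypothetical, and it only checks that the input folder exists and contains video-like files, not that they can be decoded.

```python
import os

# Assumed list of extensions for illustration; adjust to the formats you use.
VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")


def find_videos(input_dir):
    """Return sorted paths of video-like files in input_dir, or raise a clear error."""
    if not os.path.isdir(input_dir):
        raise FileNotFoundError(f"Input folder does not exist: {input_dir}")
    videos = sorted(
        os.path.join(input_dir, name)
        for name in os.listdir(input_dir)
        if name.lower().endswith(VIDEO_EXTENSIONS)
    )
    if not videos:
        raise FileNotFoundError(f"No video files found in: {input_dir}")
    return videos
```

Calling `find_videos("./videos")` before running the script would surface a wrong or empty input path early, instead of a bare AssertionError at write time.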

Onnx model

Are there any plans to release an ONNX model?

TypeError: 'int' object is not subscriptable

Upon running
python demo.py --model large --resume ./weights/_ckpt.pt.tar --input ./videos --output ./output --resize_size 384
got an error:
Output & Stacktrace:

F:\Depth_estimation\TCMonoDepth-main>python demo.py --model large --resume ./weights/_ckpt.pt.tar --input ./videos --output ./output --resize_size 384
Run Video Depth Sample
Initialize
Device: cuda
Creating model...
Loading model from ./weights/_ckpt.pt.tar
Loading model done...
Traceback (most recent call last):
  File "demo.py", line 148, in <module>
    run(args)
  File "demo.py", line 76, in run
    args.resize_size[0],  #width
TypeError: 'int' object is not subscriptable
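
The traceback shows `args.resize_size` being indexed while it holds a single integer (384 on the command line). How demo.py actually declares this argument is not shown here, so the following is only a sketch of one possible workaround: normalise the value to a (width, height) pair before indexing. The helper name `as_size_pair` is hypothetical, the width-then-height order is inferred from the `#width` comment in the traceback, and treating a single value as a square size is an assumption.

```python
import argparse


def as_size_pair(value):
    """Normalise a resize argument to a (width, height) tuple.

    Accepts a single int (assumed to mean a square size) or a
    sequence of one or two ints.
    """
    if isinstance(value, int):
        return (value, value)
    pair = tuple(int(v) for v in value)
    if len(pair) == 1:
        return (pair[0], pair[0])
    if len(pair) != 2:
        raise ValueError(f"Expected one or two integers, got {len(pair)}")
    return pair


# Hypothetical argument definition: one or two integers after --resize_size.
parser = argparse.ArgumentParser()
parser.add_argument("--resize_size", type=int, nargs="+", default=[384])

args = parser.parse_args(["--resize_size", "384"])
width, height = as_size_pair(args.resize_size)
```

With this, both `--resize_size 384` and `--resize_size 384 256` yield a pair that can be indexed safely.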

paper link

Hi, good job! Please share the paper's download link, because I can't find the paper.

valid mask

Hi, thank you for sharing. Great work. After reading your paper, I have some questions.
(a) What is an appropriate value for 'σ' in the mask M_i = exp(−σ · ||X_i − X_{i+1}||_2) (Eqn. 5)?
(b) Are the values D_i and D̂_{i+1} in Eqn. 2 the direct output of the network, or the network output after normalization?
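
For readers following question (a), here is a minimal sketch of the mask as it is quoted in this issue, M_i = exp(−σ · ||X_i − X_{i+1}||_2), evaluated per pixel. It is not the authors' implementation: taking the L2 norm over the color channels of each pixel is an assumption, whether the second frame is warped beforehand is not stated here, and the value of σ used in the paper is exactly what the question asks, so any value passed in is arbitrary.

```python
import math


def soft_occlusion_mask(frame_a, frame_b, sigma):
    """Per-pixel mask exp(-sigma * ||a - b||_2) for two same-sized frames.

    frame_a, frame_b: nested lists indexed [row][col], each entry a
    tuple of color channel values.
    """
    mask = []
    for row_a, row_b in zip(frame_a, frame_b):
        mask_row = []
        for px_a, px_b in zip(row_a, row_b):
            # Euclidean distance between the two color vectors.
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(px_a, px_b)))
            mask_row.append(math.exp(-sigma * dist))
        mask.append(mask_row)
    return mask
```

Pixels whose colors match get a weight of 1, and the weight decays toward 0 as the color difference grows, with σ controlling how fast.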

Training Code

Do you plan to release your training code sometime in the future? It would be really helpful to advance the research on monocular depth estimation!
