raymondwang987 / nvds
The official repository of the ICCV2023 paper "Neural Video Depth Stabilizer" (NVDS).
License: MIT License
Did anyone try to run NVDS in Windows? I suspect it is not that easy because of GMFlow...
I have looked at other issues, and running NVDS successfully seems to be a challenge at the moment. I am listing my steps here in the hope that someone can point out any mistakes I am making.
I am using a conda environment on Ubuntu 18.04.6 LTS.
Here are my steps:
conda create --name NVDS_attempt4 python=3.8
conda activate NVDS_attempt4
conda install pytorch torchvision -c pytorch
Following the author's strict instructions for mmcv/mmseg:
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
pip install "mmsegmentation>=1.0.0"
At this point, I tried running the code with the following command:
CUDA_VISIBLE_DEVICES=0 python infer_NVDS_midas_bi.py --base_dir ./demo_outputs/midas_init/market_6/ --vnum market_6 --infer_w 896 --infer_h 384
It complained about timm being missing, which I installed next:
pip install timm
And now, I see the following issue.
"ImportError: /home/touqeera/mambaforge/envs/NVDS_attempt4/lib/python3.8/site-packages/mmcv/_ext.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs"
I have tried several times and have not been successful so far in setting up the environment to NVDS' liking. Please advise.
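One possible cause, offered as a guess rather than a confirmed diagnosis: an undefined symbol inside mmcv's _ext library usually means the prebuilt mmcv binary was compiled against a different PyTorch build than the one installed. The sketch below builds a wheel index URL for a given PyTorch/CUDA pair, following the URL pattern quoted in another issue on this page (.../mmcv/dist/cu111/torch1.9.0/index.html). The function name mmcv_index_url is hypothetical, and forcing the patch version to 0 is an assumption about how the index is organized.

```python
import re
from typing import Optional


def mmcv_index_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build a wheel index URL that follows the pattern quoted in this thread.

    Hypothetical helper: the patch version is forced to 0 on the assumption
    that wheels are indexed per PyTorch minor release.
    """
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized torch version: {torch_version!r}")
    major, minor = match.groups()
    # "11.1" becomes "cu111"; a CPU-only build is mapped to "cpu" (assumption).
    cuda_tag = "cpu" if cuda_version is None else "cu" + cuda_version.replace(".", "")
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"{cuda_tag}/torch{major}.{minor}.0/index.html"
    )


if __name__ == "__main__":
    # Values taken from the environment listed in another issue on this page.
    print(mmcv_index_url("1.9.0", "11.1"))
```

The two arguments can be read from torch.__version__ and torch.version.cuda inside the target environment, and the resulting URL passed to pip install with the -f option so that the mmcv wheel matches the installed PyTorch.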
Thanks for your great work! I wonder whether I can use your pretrained model to test my video data directly, or whether I should fine-tune the model on my video data first to get correct results.
Hey,
Seems like your code is referencing a mit_b5 checkpoint from your own computer.
File "/home/pablo/miniconda3/envs/NVDS/lib/python3.8/site-packages/mmcv/runner/checkpoint.py", line 260, in load_from_local
raise IOError(f'{filename} is not a checkpoint file')
OSError: /data/wangyiran/work2/mymodels/mit_pretrain/mit_b5.pth is not a checkpoint file
This comes from the model definition in full_model.py:
self.backbone.init_weights('/data/wangyiran/work2/mymodels/mit_pretrain/mit_b5.pth')
Where can I find it? Thanks.
Dear NVDS authors,
Thanks for publishing this great work! I run depth estimation (DE) with MiDaS and then build a sliding window of 4 RGB-D samples to get the NVDS output. I am seeing the results below for the first window (containing 3 copies plus the image and depth from MiDaS):
What do you think I am doing wrong?
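For reference, the windowing described in this question can be sketched as follows. This is only an illustration of the padding scheme the poster describes (repeating the first frame until enough real frames exist), not the repository's own code. The function name sliding_windows is hypothetical, and placing the target frame last in each window is an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(frames: Sequence[T], window: int = 4) -> List[List[T]]:
    """Return one window per frame, each ending at its target frame.

    Indices that fall before the start of the sequence are clamped to 0,
    so the first frame is repeated until enough real frames are available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    windows = []
    for t in range(len(frames)):
        start = t - window + 1
        windows.append([frames[max(i, 0)] for i in range(start, t + 1)])
    return windows


if __name__ == "__main__":
    # Each item stands in for one RGB-D sample (image plus initial depth).
    for w in sliding_windows(["f0", "f1", "f2", "f3", "f4"]):
        print(w)
```

With this scheme the first window contains the first sample four times, which matches the "3 copies plus the frame itself" setup mentioned above.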
Hi, very interesting work!
I also have a question about the temporal loss (Eq. 4 in the main paper): are the predicted disparity D and the RGB images in the range 0 to 255, or normalized to (0, 1)? Also, the temporal loss is computed between two neighboring frames, right? Have you tried extending it to multiple frames?
Thank you for your excellent work. I tried it on our surgical dataset and found that the final mixed result was not as good as the initial result. How can I fine-tune the Stabilization Network?
In my desired application, I need a depth method that outputs relative depth across frames. I am aiming to produce metric depth in a video by fitting the relative depth to a small subset of pixels with known "true depth" values. I would like to fit a single scale and shift parameter across multiple frames (probably the context length of NVDS). Does NVDS achieve consistent scale/shift across frames? That is, if I compared NVDS outputs to a true depth video (where NVDS outputs were evaluated up to a scale and shift in inverse depth), would the best-fit scale/shift parameters for each frame be roughly the same?
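A minimal sketch of the fitting step described in this question, assuming ordinary least squares on inverse depth over the pixels with known ground truth. fit_scale_shift is a hypothetical helper and not part of NVDS. Comparing the per-frame fits with one fit over the pooled samples of several frames is one way to check empirically whether scale and shift stay consistent.

```python
from typing import List, Sequence, Tuple


def fit_scale_shift(pred: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (scale, shift) such that scale * pred + shift ~ target.

    Both inputs are inverse-depth values at the same sparse pixel locations.
    """
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need at least two paired samples")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0.0:
        raise ValueError("predictions are constant; scale is undefined")
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    scale = cov / var_p
    shift = mean_t - scale * mean_p
    return scale, shift


if __name__ == "__main__":
    # Made-up sparse samples for two frames (illustrative numbers only).
    frames_pred: List[List[float]] = [[0.1, 0.4, 0.7, 1.0], [0.2, 0.5, 0.9]]
    frames_true = [[2.0 * p + 0.5 for p in frame] for frame in frames_pred]

    # Per-frame fits: similar values suggest a consistent scale and shift.
    for pred, true in zip(frames_pred, frames_true):
        print(fit_scale_shift(pred, true))

    # One fit over the pooled samples of all frames in the window.
    pooled_pred = [p for frame in frames_pred for p in frame]
    pooled_true = [t for frame in frames_true for t in frame]
    print(fit_scale_shift(pooled_pred, pooled_true))
```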
Great work! I'm wondering whether the work can also predict the consistent depth of egocentric video.
Thanks for your great work. I tried to fine-tune (or retrain) NVDS on other datasets such as TartanAir and MIPI. I use the real depth directly as the input and find that the test results have many artifacts, especially on the ground. I also tried normalizing the inputs with a predefined maximum depth (e.g., 20 meters), but that did not work. Do the inputs to NVDS have to be relative inverse depths, or are there tricks or details I need to pay attention to? Thanks, and I look forward to your reply!
Thanks for the amazing work and releasing the code!
Are you planning to release the checkpoints and/or test script for the NYUDv2 dataset, so that I can reproduce the numbers in the paper?
Great work!
Is there any plan to release the training code?
Hi there, I'm trying to run infer_NVDS_dpt_bi.py on some images with
(NVDS) ~/repo/NVDS python infer_NVDS_dpt_bi.py --base_dir ./demo_outputs/dpt_init/000423/ --vnum 000423 --infer_w 896 --infer_h 384
using an M2 MacBook Pro (I changed the device to use cpu).
When I run it, the normal depth network results look okay (e.g. demo_outputs/dpt_init/000423/initial/gray/frame_000000.png).
But the results of the NVDS forward pass are all black (e.g. demo_outputs/dpt_init/000423/1/gray/frame_000000.png and demo_outputs/dpt_init/000423/1/color/0.png).
These are my installed packages:
(NVDS) ~/repo/NVDS conda list
# packages in environment at /opt/homebrew/anaconda3/envs/NVDS:
#
# Name Version Build Channel
absl-py 2.1.0 pypi_0 pypi
appnope 0.1.2 py38hca03da5_1001
asttokens 2.0.5 pyhd3eb1b0_0
attr 0.3.2 pypi_0 pypi
backcall 0.2.0 pyhd3eb1b0_0
blas 1.0 openblas
brotli-python 1.0.9 py38hc377ac9_7
ca-certificates 2023.12.12 hca03da5_0
cachetools 5.3.2 pypi_0 pypi
certifi 2023.11.17 py38hca03da5_0
cffi 1.16.0 py38h80987f9_0
chardet 5.2.0 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
comm 0.1.2 py38hca03da5_0
contourpy 1.1.1 pypi_0 pypi
cryptography 41.0.3 py38h3c57c4d_0
cycler 0.12.1 pypi_0 pypi
debugpy 1.6.7 py38h313beb8_0
decorator 5.1.1 pyhd3eb1b0_0
einops 0.7.0 pypi_0 pypi
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
executing 0.8.3 pyhd3eb1b0_0
filelock 3.13.1 py38hca03da5_0
fonttools 4.47.2 pypi_0 pypi
freetype 2.12.1 h1192e45_0
fsspec 2023.12.2 pypi_0 pypi
future 0.18.3 pyhd8ed1ab_0 conda-forge
giflib 5.2.1 h80987f9_3
gmp 6.2.1 hc377ac9_3
gmpy2 2.1.2 py38h8c48613_0
google-auth 2.26.2 pypi_0 pypi
google-auth-oauthlib 1.0.0 pypi_0 pypi
grpcio 1.60.0 pypi_0 pypi
h5py 3.10.0 pypi_0 pypi
huggingface-hub 0.20.3 pypi_0 pypi
idna 3.6 pypi_0 pypi
imageio 2.33.1 pypi_0 pypi
importlib-metadata 7.0.1 py38hca03da5_0
importlib-resources 6.1.1 pypi_0 pypi
importlib_metadata 7.0.1 hd3eb1b0_0
ipykernel 6.29.0 pyh3cd1d5f_0 conda-forge
ipython 8.12.3 pypi_0 pypi
jedi 0.18.1 py38hca03da5_1
jinja2 3.1.2 py38hca03da5_0
jpeg 9e h80987f9_1
jupyter_client 8.6.0 py38hca03da5_0
jupyter_core 5.5.0 py38hca03da5_0
kiwisolver 1.4.5 pypi_0 pypi
lazy-loader 0.3 pypi_0 pypi
lcms2 2.12 hba8e193_0
lerc 3.0 hc377ac9_0
libblas 3.9.0 21_osxarm64_openblas conda-forge
libcblas 3.9.0 21_osxarm64_openblas conda-forge
libcxx 14.0.6 h848a8c0_0
libdeflate 1.17 h80987f9_1
libffi 3.4.4 hca03da5_0
libgfortran 5.0.0 11_3_0_hca03da5_28
libgfortran5 11.3.0 h009349e_28
liblapack 3.9.0 21_osxarm64_openblas conda-forge
libopenblas 0.3.21 h269037a_0
libpng 1.6.39 h80987f9_0
libprotobuf 3.20.3 h514c7bf_0
libsodium 1.0.18 h1a28f6b_0
libtiff 4.5.1 h313beb8_0
libuv 1.44.2 h80987f9_0
libwebp 1.3.2 ha3663a8_0
libwebp-base 1.3.2 h80987f9_0
llvm-openmp 14.0.6 hc6e5704_0
lz4-c 1.9.4 h313beb8_0
markdown 3.5.2 pypi_0 pypi
markupsafe 2.1.4 pypi_0 pypi
matplotlib 3.7.4 pypi_0 pypi
matplotlib-inline 0.1.6 py38hca03da5_0
mpc 1.1.0 h8c48613_1
mpfr 4.0.2 h695f6f0_1
mpmath 1.3.0 py38hca03da5_0
natsort 8.4.0 pypi_0 pypi
ncurses 6.4 h313beb8_0
nest-asyncio 1.5.6 py38hca03da5_0
networkx 3.1 py38hca03da5_0
ninja 1.10.2 hca03da5_5
ninja-base 1.10.2 h525c30c_5
numpy 1.24.3 py38h1398885_0
numpy-base 1.24.3 py38h90707a3_0
oauthlib 3.2.2 pypi_0 pypi
olefile 0.47 pyhd8ed1ab_0 conda-forge
opencv-python 4.9.0.80 pypi_0 pypi
openjpeg 2.3.0 h7a6adac_2
openssl 1.1.1w h1a28f6b_0
packaging 23.1 py38hca03da5_0
parso 0.8.3 pyhd3eb1b0_0
pexpect 4.9.0 pypi_0 pypi
pickleshare 0.7.5 pyhd3eb1b0_1003
pillow 10.2.0 pypi_0 pypi
pip 23.3.1 py38hca03da5_0
platformdirs 3.10.0 py38hca03da5_0
prompt-toolkit 3.0.43 py38hca03da5_0
prompt_toolkit 3.0.42 hd8ed1ab_0 conda-forge
protobuf 4.25.2 pypi_0 pypi
psutil 5.9.0 py38h1a28f6b_0
ptyprocess 0.7.0 pyhd3eb1b0_2
pure_eval 0.2.2 pyhd3eb1b0_0
pyasn1 0.5.1 pypi_0 pypi
pyasn1-modules 0.3.0 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0
pygments 2.15.1 py38hca03da5_1
pyopenssl 23.2.0 py38hca03da5_0
pyparsing 3.1.1 pypi_0 pypi
pysocks 1.7.1 py38hca03da5_0
python 3.8.13 hbdb9e5c_1
python-dateutil 2.8.2 pyhd3eb1b0_0
python_abi 3.8 2_cp38 conda-forge
pytorch 2.1.0 gpu_mps_py38h87e4ab7_100
pytorch-cpu 1.9.0 cpu_py38hd610c6a_2 conda-forge
pywavelets 1.4.1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
pyzmq 25.1.2 py38h313beb8_0
readline 8.2 h1a28f6b_0
requests 2.31.0 py38hca03da5_0
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.9 pypi_0 pypi
safetensors 0.4.2 pypi_0 pypi
scikit-image 0.21.0 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
setuptools 68.2.2 py38hca03da5_0
six 1.16.0 pyhd3eb1b0_1
sleef 3.5.1 h80987f9_2
sqlite 3.41.2 h80987f9_0
stack-data 0.6.3 pypi_0 pypi
stack_data 0.2.0 pyhd3eb1b0_0
sympy 1.12 py38hca03da5_0
tensorboard 2.14.0 pypi_0 pypi
tensorboard-data-server 0.7.2 pypi_0 pypi
tifffile 2023.7.10 pypi_0 pypi
timm 0.9.12 pypi_0 pypi
tk 8.6.12 hb8d0fd4_0
torchvision 0.15.2 cpu_py38h31aa045_0
tornado 6.3.3 py38h80987f9_0
tqdm 4.66.1 pypi_0 pypi
traitlets 5.7.1 py38hca03da5_0
typing-extensions 4.9.0 py38hca03da5_1
typing_extensions 4.9.0 py38hca03da5_1
urllib3 2.1.0 pypi_0 pypi
wcwidth 0.2.5 pyhd3eb1b0_0
werkzeug 3.0.1 pypi_0 pypi
wheel 0.41.2 py38hca03da5_0
xz 5.4.5 h80987f9_0
zeromq 4.3.5 h313beb8_0
zipp 3.17.0 py38hca03da5_0
zlib 1.2.13 h5a0b063_0
zstd 1.5.5 hd90d995_0
Curious if you have seen this before or have any suggestions to debug this? Thank you!
Could you please make a Colab notebook for the project?
My system doesn't meet the requirements for installing and running your code, so a Colab notebook would allow me to use it.
Thanks
Hi, thanks for sharing this great work. I found something strange in the inference code (line 769 in edd9838).
There seems to be a bug when mixing the results: shouldn't one of the two min_fwd be min_bwd? Thanks.
Thank you for a great project. Can it also be used to stabilize the outputs of other video tasks, such as video matting or video optical flow estimation?
@RaymondWang987: We have tried NVDS with MiDaS, DPT, MiDaS 3.1, and NewCRFs. The results are quite satisfactory. You can simply change the depth predictor to MiDaS 3.1 (only adjusting one line in our demo code) and our NVDS can produce significant improvement in temporal consistency.
I suppose you were referring to these lines:
dpt = MidasNet_large('./dpt/checkpoints/midas_v21-f6b98070.pt', non_negative=True).to(device_flow)
dpt = DPTDepthModel(path='./dpt/checkpoints/dpt_large-midas-2f21e586.pt', etc.
I thought I could simply point it to a different model file (dpt_beit_large_512.pt from MiDaS 3.1), but it crashes, so it seems it's not that easy.
Would you mind adding some details?
Thank you very much for your excellent work!
I noticed that the VDW dataset includes range_avg.txt (2 numbers), shift_scale_lr.txt (4 numbers), and ver_ratio.txt (2 numbers). I would like to know the meaning of these numbers. Thank you very much for your response, and once again, thank you for your outstanding work.
Dear NVDS authors,
Thank you for publishing this outstanding work. However, I have some questions while reading your paper. Since the depth prediction network is fixed during the training of the stabilization network, I would like to understand why there is a spatial loss term L(t-1). According to my understanding, during inference, the stabilization network takes four depth inputs and outputs the depth for the target frame, without explicitly providing the depth for t-1. So, during training, why is there a spatial loss term L(t-1)? Does the stabilization network simultaneously output stabilization depth for all four frames? If not, does it involve inferring t-1 depth twice during each gradient backward pass – once for input t-4 to t-1, producing the depth for t-1, and another for input t-3 to t, producing the depth for t, and then calculating the loss?
Apart from this question, I would also like to understand how the temporal loss during training, which uses t-1 depth, is obtained.
Thank you for your clarification.
Is it possible to use a newer depth model like Patchfusion or Marigold?
Thank you for your great work.
The RGB images have a different resolution from the depth images. If I want to align the resolution of the two, do I simply resize them?
I tried to install NVDS on the Google colab platform, but I get the error below. Any suggestions?
!python infer_NVDS_dpt_bi.py --/content/drive/MyDrive/NVDS-main/demo_outputs/dpt_init/000423/ --vnum 000423 --infer_w 896 --infer_h 384
!python infer_NVDS_midas_bi.py --/content/drive/MyDrive/NVDS-main/demo_outputs/midas_init/000423/ --vnum 000423 --infer_w 896 --infer_h 384
Traceback (most recent call last):
File "/content/drive/MyDrive/NVDS-main/infer_NVDS_dpt_bi.py", line 22, in <module>
from full_model import *
File "/content/drive/MyDrive/NVDS-main/full_model.py", line 13, in <module>
from stabilization_attention import BasicLayer3d3
File "/content/drive/MyDrive/NVDS-main/stabilization_attention.py", line 8, in <module>
from einops import rearrange
ModuleNotFoundError: No module named 'einops'
Traceback (most recent call last):
File "/content/drive/MyDrive/NVDS-main/infer_NVDS_midas_bi.py", line 22, in <module>
from full_model import *
File "/content/drive/MyDrive/NVDS-main/full_model.py", line 13, in <module>
from stabilization_attention import BasicLayer3d3
File "/content/drive/MyDrive/NVDS-main/stabilization_attention.py", line 8, in <module>
from einops import rearrange
ModuleNotFoundError: No module named 'einops'
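The traceback says the einops package is not installed in the Colab runtime. A quick way to see which other imports would fail before launching the inference script is sketched below. The module list is an assumption assembled from the imports and error messages quoted on this page (einops, timm, mmcv, mmseg), and missing_modules is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list, taken from the imports and errors quoted in these issues.
    required = ["torch", "einops", "timm", "mmcv", "mmseg"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Note that the import name and the pip package name can differ: mmseg is installed as mmsegmentation, for example. In a Colab cell, a missing package such as einops can be installed with !pip install einops before rerunning the script.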
You mention in the instructions that we can split up the tasks to save some memory. How would I go about doing that? Save the depth as arrays and write a new inference.py code? Or is it supported with some arguments?
Thanks for sharing the great work!
I saw you normalized NYUDv2's depth with the code at NVDS/test_NYU_depth_metrics.py, line 58 in 8030e99.
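Hello, while reproducing the code from your GitHub repository I got the error below, which seems to be related to mmcv-full. I tried both pip install mmcv-full==1.3.0 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html and pip install mmcv-full==1.3.0, but both give the same error. Is there a solution? The error message is as follows: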
(NVDS) qjc@omnisky:/data3/qjcnerf/NVDS$ CUDA_VISIBLE_DEVICES=0 python infer_NVDS_dpt_bi.py --base_dir ./demo_outputs/dpt_init/000423/ --vnum 000423 --infer_w 896 --infer_h 384
Traceback (most recent call last):
File "infer_NVDS_dpt_bi.py", line 19, in <module>
from backbone import *
File "/data3/qjcnerf/NVDS/backbone.py", line 15, in <module>
from mmseg.models.builder import BACKBONES
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmseg/models/__init__.py", line 1, in <module>
from .backbones import * # noqa: F401,F403
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmseg/models/backbones/__init__.py", line 2, in <module>
from .fast_scnn import FastSCNN
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmseg/models/backbones/fast_scnn.py", line 7, in <module>
from mmseg.models.decode_heads.psp_head import PPM
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmseg/models/decode_heads/__init__.py", line 16, in <module>
from .point_head import PointHead
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmseg/models/decode_heads/point_head.py", line 6, in <module>
from mmcv.ops import point_sample
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmcv/ops/__init__.py", line 1, in <module>
from .bbox import bbox_overlaps
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmcv/ops/bbox.py", line 3, in <module>
ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 11, in load_ext
ext = importlib.import_module('mmcv.' + name)
File "/data1/qjc_new/anaconda/envs/NVDS/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'
Here is my environment information:
`(NVDS) qjc@omnisky:/data3/qjcnerf/NVDS$ conda list
packages in environment at /data1/qjc_new/anaconda/envs/NVDS:
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
addict 2.4.0 pypi_0 pypi
asttokens 2.2.1 pypi_0 pypi
attr 0.3.2 pypi_0 pypi
backcall 0.2.0 pypi_0 pypi
blas 1.0 mkl
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2023.7.22 hbcca054_0 conda-forge
certifi 2023.7.22 pypi_0 pypi
charset-normalizer 3.2.0 pypi_0 pypi
click 8.1.6 pypi_0 pypi
colorama 0.4.6 pypi_0 pypi
cudatoolkit 11.1.1 ha002fc5_10 conda-forge
cycler 0.11.0 pypi_0 pypi
decorator 5.1.1 pypi_0 pypi
einops 0.4.1 pypi_0 pypi
executing 1.2.0 pypi_0 pypi
ffmpeg 4.3 hf484d3e_0 pytorch
fonttools 4.41.1 pypi_0 pypi
freetype 2.10.4 h0708190_1 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gnutls 3.6.13 h85f3911_1 conda-forge
h5py 3.9.0 pypi_0 pypi
idna 3.4 pypi_0 pypi
imageio 2.31.1 pypi_0 pypi
importlib-metadata 6.8.0 pypi_0 pypi
intel-openmp 2022.1.0 h9e868ea_3769
ipython 8.5.0 pypi_0 pypi
jedi 0.18.2 pypi_0 pypi
jpeg 9b h024ee3a_2
kiwisolver 1.4.4 pypi_0 pypi
lame 3.100 h7f98852_1001 conda-forge
ld_impl_linux-64 2.38 h1181459_1
libffi 3.3 he6710b0_2
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libiconv 1.17 h166bdaf_0 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libstdcxx-ng 11.2.0 h1234567_1
libtiff 4.1.0 h2733197_1
libuv 1.43.0 h7f98852_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markdown 3.4.3 pypi_0 pypi
markdown-it-py 3.0.0 pypi_0 pypi
matplotlib 3.5.3 pypi_0 pypi
matplotlib-inline 0.1.6 pypi_0 pypi
mdurl 0.1.2 pypi_0 pypi
mkl 2022.1.0 hc2b9512_224
mmcv 1.3.0 pypi_0 pypi
mmcv-full 1.3.0 pypi_0 pypi
mmengine 0.8.2 pypi_0 pypi
mmsegmentation 0.11.0 pypi_0 pypi
model-index 0.1.11 pypi_0 pypi
ncurses 6.4 h6a678d5_0
nettle 3.6 he412f7d_0 conda-forge
ninja 1.11.0 h924138e_0 conda-forge
nose 1.3.7 pypi_0 pypi
numpy 1.23.2 pypi_0 pypi
olefile 0.46 pyh9f0ad1d_1 conda-forge
opencv-python 4.8.0.74 pypi_0 pypi
opencv-python-headless 4.8.0.74 pypi_0 pypi
opendatalab 0.0.9 pypi_0 pypi
openh264 2.1.1 h780b84a_0 conda-forge
openmim 0.3.9 pypi_0 pypi
openssl 1.1.1u h7f8727e_0
ordered-set 4.1.0 pypi_0 pypi
packaging 23.1 pypi_0 pypi
pandas 2.0.3 pypi_0 pypi
parso 0.8.3 pypi_0 pypi
pexpect 4.8.0 pypi_0 pypi
pickleshare 0.7.5 pypi_0 pypi
pillow 10.0.0 pypi_0 pypi
pip 23.1.2 py38h06a4308_0
platformdirs 3.9.1 pypi_0 pypi
prettytable 3.8.0 pypi_0 pypi
prompt-toolkit 3.0.39 pypi_0 pypi
ptyprocess 0.7.0 pypi_0 pypi
pure-eval 0.2.2 pypi_0 pypi
pycryptodome 3.18.0 pypi_0 pypi
pygments 2.15.1 pypi_0 pypi
pyparsing 3.1.0 pypi_0 pypi
python 3.8.13 haa1d7c7_1
python-dateutil 2.8.2 pypi_0 pypi
pytorch 1.9.0 py3.8_cuda11.1_cudnn8.0.5_0 pytorch
pytz 2023.3 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h5eee18b_0
requests 2.31.0 pypi_0 pypi
resnest 0.0.5 pypi_0 pypi
rfconv 0.0.2b20210406 pypi_0 pypi
rich 13.4.2 pypi_0 pypi
scipy 1.9.1 pypi_0 pypi
setuptools 67.8.0 py38h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.41.2 h5eee18b_0
stack-data 0.6.2 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
termcolor 2.3.0 pypi_0 pypi
terminaltables 3.1.10 pypi_0 pypi
timm 0.6.7 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tomli 2.0.1 pypi_0 pypi
torchvision 0.10.0 py38_cu111 pytorch
tqdm 4.65.0 pypi_0 pypi
traitlets 5.9.0 pypi_0 pypi
typing_extensions 4.7.1 pyha770c72_0 conda-forge
tzdata 2023.3 pypi_0 pypi
urllib3 2.0.4 pypi_0 pypi
wcwidth 0.2.6 pypi_0 pypi
wheel 0.38.4 py38h06a4308_0
xz 5.4.2 h5eee18b_0
yapf 0.40.1 pypi_0 pypi
zipp 3.16.2 pypi_0 pypi
zlib 1.2.13 h5eee18b_0
zstd 1.4.9 haebb681_0
`
I wonder whether the input image size is fixed, as I run into problems when I use images of a different resolution (e.g., 688*384):
CUDA_VISIBLE_DEVICES=0 python infer_NVDS_dpt_bi.py --base_dir ./demo_outputs/dpt_init/kid_running/ --vnum kid_running --infer_w 688 --infer_h 384
let us begin test NVDS(DPT) demo
Load checkpoint: ./gmflow/checkpoints/gmflow_sintel-0c07dcb3.pth
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
******self.shift_size: 0
here mask none
/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/functional.py:3609: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
warnings.warn(
Traceback (most recent call last):
File "infer_NVDS_dpt_bi.py", line 396, in <module>
outputs = dpt.forward(rgb)
File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 115, in forward
inv_depth = super().forward(x).squeeze(dim=1)
File "/data_ssd/home/z00647125/NVDS/dpt/models.py", line 80, in forward
path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/data_ssd/home/z00647125/NVDS/dpt/blocks.py", line 372, in forward
output = self.skip_add.add(output, res)
File "/opt/conda/envs/NVDS/lib/python3.8/site-packages/torch/nn/quantized/modules/functional_modules.py", line 43, in add
r = torch.add(x, y)
RuntimeError: The size of tensor a (44) must match the size of tensor b (43) at non-singleton dimension 3
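A possible explanation, stated as an assumption rather than an answer from the authors: 688 is not a multiple of 32 (688 / 32 = 21.5). A feature map that is downsampled by 32 and then upsampled by 2 would be 44 wide, while the skip connection at 1/16 resolution is 43 wide, which matches the numbers in the error. The default 896 and 384 are both multiples of 32. The sketch below snaps a requested size to a multiple of 32. snap_to_multiple is a hypothetical helper, and the factor 32 is inferred from the error message only.

```python
def snap_to_multiple(size: int, multiple: int = 32, round_up: bool = True) -> int:
    """Snap a positive size to a multiple of `multiple`, rounding up or down."""
    if size <= 0 or multiple <= 0:
        raise ValueError("size and multiple must be positive")
    if round_up:
        # Ceiling division without floating point.
        return -(-size // multiple) * multiple
    # Round down, but never below one full multiple.
    return max(size // multiple, 1) * multiple


if __name__ == "__main__":
    requested_w, requested_h = 688, 384
    print(snap_to_multiple(requested_w), snap_to_multiple(requested_h))
    print(snap_to_multiple(requested_w, round_up=False))
```

Under this assumption, passing --infer_w 704 (or 672) instead of 688 would keep the feature maps aligned, and the output can be resized back to the original resolution afterwards.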
Hello
Will you be able to release the evaluation script for the Sintel dataset? If not, do you have any guidance on how to run the evaluation on Sintel? I am unable to reproduce the Sintel results. Thank you!
David
Hi, thanks for your great work for video depth!
I have a question about the temporal loss (Eq. 4 in the main paper): in my opinion, the warped depth values from time i might not equal the corresponding depth values at time j if we don't take the camera pose and 3D transformation into consideration. Do you think this is a bug, or am I missing something here?
What should I do if I want to estimate depth directly on a video stream? Will the latency be large?