
hd3's Introduction

HD3

This is a PyTorch implementation of our paper:

Hierarchical Discrete Distribution Decomposition for Match Density Estimation (CVPR 2019)

Zhichao Yin, Trevor Darrell, Fisher Yu

We propose a framework suitable for learning probabilistic pixel correspondences. It has applications including but not limited to stereo matching and optical flow, with inherent uncertainty estimation. HD3 achieves state-of-the-art results for both tasks on established benchmarks (KITTI & MPI Sintel).

arXiv preprint: https://arxiv.org/abs/1812.06264

Requirements

This code has been tested with Python 3.6, PyTorch 1.0 and CUDA 9.0 on Ubuntu 16.04.

Getting Started

  • Install PyTorch 1.0. We recommend anaconda3 for managing the Python environment. You can then install all remaining dependencies with:
pip install -r requirements.txt

Model Training

To train a model on a specific dataset, simply run

bash scripts/train.sh

Note that the scripts contain several placeholders which you should replace with your own choices. For instance, you can specify the dataset type (e.g. FlyingChairs) via --dataset_name, select the network architecture via --encoder and --decoder, and switch the task (stereo or flow) via --task. You can also partially load the weights of a pretrained backbone network via --pretrain_base (download ImageNet pretrained DLA-34 here), or strictly initialize the weights from a pretrained model via --pretrain. A hypothetical filled-in invocation is sketched below.
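
For illustration only, a filled-in command might look like the following; the paths and the list file are placeholders, and the flag names are the ones used in this README and in the train.py commands quoted in the issues below:

python -u train.py \
  --dataset_name=FlyingChairs \
  --task=flow \
  --train_root=/path/to/FlyingChairs \
  --train_list=lists/FlyingChairs_train.txt \
  --encoder=dlaup \
  --decoder=hda \
  --base_lr=0.0002 \
  --batch_size=4 \
  --epochs=200 \
  --pretrain_base=/path/to/dla34_imagenet.pth \
  --save_path=/path/to/outputs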

You can then start a tensorboard session by

tensorboard --logdir=/path/to/log/files --port=8964

and visualize your training progress by accessing http://localhost:8964 in your browser.

  • We provide the learning rate schedules and augmentation configurations for all of our experiments. For other detailed hyperparameters, please refer to our paper to reproduce our results.

Model Inference

To test a model on a folder of images, please run

bash scripts/test.sh

Please provide the list of image pair names and pass it to --data_list. The script will generate predictions for every pair of images and save them in --save_folder with the same folder hierarchy as the input images. You can choose the saved flow format (e.g. png or flo) via --flow_format. When the folder contains images of different sizes (e.g. KITTI), please make sure --batch_size is 1.

  • When ground truth is available, you can optionally pass --evaluate to calculate the End-Point Error of your predictions. Please make sure each line of the list consists of img-name1 img-name2 gtruth-name, as in the hypothetical example below.
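
For instance, a hypothetical KITTI flow list could contain lines such as:

image_2/000000_10.png image_2/000000_11.png flow_occ/000000_10.png
image_2/000001_10.png image_2/000001_11.png flow_occ/000001_10.png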

Model Zoo

We provide pretrained models for all of our experiments. To download them, simply run

bash scripts/download_models.sh

The model names come in the format model-name_dataset-names. Models are named hd3f/hd3s for optical flow and stereo matching respectively, and a suffix of c is appended for models with the context module. The dataset-names part indicates the dataset schedule used for training the model; for example, hd3fc_chairs_things_kitti denotes a flow model with context, trained on FlyingChairs, then FlyingThings3D, then KITTI. You should be able to obtain similar results by running the test script we provide.

Citation

If you find our work or our repo useful in your research, please consider citing our paper:

@InProceedings{Yin_2019_CVPR,
author = {Yin, Zhichao and Darrell, Trevor and Yu, Fisher},
title = {Hierarchical Discrete Distribution Decomposition for Match Density Estimation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}

FAQ

  • Why are the model outputs different even for the same input in different runs?

    Some PyTorch ops are non-deterministic (e.g. torch.Tensor.scatter_add_). If you fix all the random seeds for Python and PyTorch (see the sketch below), you should get identical results.
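
    A typical seed-fixing snippet looks like the following (a sketch; note that some CUDA ops can remain non-deterministic even with these settings):

    import random
    import numpy as np
    import torch

    random.seed(0)                             # Python RNG
    np.random.seed(0)                          # NumPy RNG
    torch.manual_seed(0)                       # PyTorch CPU RNG
    torch.cuda.manual_seed_all(0)              # PyTorch GPU RNGs
    torch.backends.cudnn.deterministic = True  # pick deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False     # disable cuDNN autotuning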

  • Why does the model finetuned on the KITTI dataset exhibit artifacts in the sky regions?

    This is due to the limited amount of data during the finetuning stage. Effective remedies include an additional smoothness loss term during finetuning and knowledge distillation from the model pretrained on the synthetic datasets.

  • Why does my evaluation metric look abnormal?

    Please confirm that the synthetic dataset you are using is the DispNet/FlowNet2.0 dataset subsets rather than the original complete version (the data formats differ subtly).

Acknowledgements

We thank Houning Hu for making the teaser image, Simon Niklaus for the correlation operator and Clément Pinard for the FlowNet implementation.

hd3's Issues

Some Results and Questions

Thanks for your wonderful work!

I downloaded your code and tested it on some image pairs captured by myself, but the results are not very good compared to FlowNet2 (https://github.com/NVIDIA/flownet2-pytorch).

Can you give me some advice on how to improve the results based on your code?
Thanks.

Pair 1:
[left and right input images]

Results of FlowNet2:
[flow visualization]

Results of HD3:
[flow visualization]

Pair 2:
[left and right input images]

Results of FlowNet2:
[flow visualization]

Results of HD3:
[flow visualization]

Question about pretraining in stereo task

Thanks for your wonderful work!
One thing that puzzles me is why you don't use the full SceneFlow dataset during pre-training for the stereo task. The SceneFlow dataset has more data than the FlyingThings3D subset, and it also contains the Monkaa and Driving subsets. In theory, using the full SceneFlow dataset should give better generalization performance.

Thanks~

When do you plan to release the code?

Thank you for planning to share the source code.
This work is so amazing and I would like to study your code.

Could you share the date you plan to release your code?
Thanks,
Yuki

Some questions

  1. I found two undefined variables H, W below. Can you fix them? What's the intuition behind the weight 4**(ds - l)? (A possible reading is sketched in the note after this list.)

    kld_loss += 4**(ds - l) / (H * W) * criterion(

  2. Why do you yield the parameters of the decoder twice in HD3Model.optim_parameters()? Did you do this for a reason?

    def optim_parameters(self):

  3. Can you provide the training&evaluation loss curves of the stereo models for further analysis?

Thank you!
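
(On the weight in question 1, one possible reading, offered as an interpretation rather than an authoritative answer: if the level-l prediction is downsampled by a factor of 2**(ds - l) per side, it contains H * W / 4**(ds - l) pixels, so 4**(ds - l) / (H * W) equals 1 / (number of pixels at level l), i.e. the summed KL divergence at each level is averaged per pixel.)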

Large EPE error for stereo task

Hi,

Thanks for your code! When I run the inference code to evaluate disparity estimation on the SceneFlow dataset, the EPE is very large (121 on one test image) at line 42 (https://github.com/ucbdrive/hd3/blob/master/hd3model.py#L42). When I visualize the network output curr_vect, all estimates are negative (https://github.com/ucbdrive/hd3/blob/master/inference.py#L197). I can confirm I am using the stereo network, since I am using the model hd3sc_things-57947496.pth with context mode on.

Is the reported EPE in pixels? Is this normal? I am not sure whether the EPE is only for optical flow or can be used for disparity estimation as well.

Thank you very much!

Question about how occlusion is handled

Dear authors:

Thank you very much for sharing your code! I have a question about how HD3 handles occlusion.

Reading your code and paper: after the cost volume is constructed, the decoder seems to use 2D convolutions to find the match density, which is then converted to motion vectors. What happens when there is occlusion and no ground truth motion vector can be found? E.g. in stereo matching, left images usually have an unmatched region on the left. Can you share some insights?

Thank you again!

Some layers have no gradient during backward

When I use DLA-34, it seems that some layers have no gradient during the backward pass.

Specifically, they are :
module.encoder.base.level3.project.0.weight
module.encoder.base.level3.project.1.weight
module.encoder.base.level3.project.1.bias
module.encoder.base.level4.project.0.weight
module.encoder.base.level4.project.1.weight
module.encoder.base.level4.project.1.bias

I don't know why, though maybe it doesn't affect training.

Error in _prob2cornerflow() function

out, indice = max_pool(avg_pool(pr))

In this line, indice is of type long and d is also an int; however, the result of / may be a float, resulting in an error in the following calculation.

Can you help me with this line? What is it about?
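
A minimal illustration of the type pitfall, with hypothetical values rather than the repo's actual code: in Python 3, / is true division, so recent PyTorch versions yield floats (or raise an error) when dividing integer tensors, whereas // (floor division) keeps the dtype integral.

import torch

indice = torch.tensor([7, 12, 23])  # hypothetical long indices from max_pool
d = 5                               # hypothetical window size
row = indice // d                   # floor division: dtype stays long
col = indice % d                    # remainder: dtype stays long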

Question about cost volume

Hello, I'm sorry to ask you another question. I don't see a description of the cost volume in the paper, and the code is not very understandable for me because it relies on CUDA, which I haven't learned.
Could you please describe the cost volume construction in general terms? Thank you very much!
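
For context, here is a minimal plain-PyTorch sketch of local cost volume construction (the repo itself uses a cupy/CUDA correlation kernel; this only illustrates the idea): for every displacement (dy, dx) within a search radius r, the reference features are correlated with shifted target features, giving (2r+1)**2 channels, e.g. 81 for r = 4.

import torch
import torch.nn.functional as F

def cost_volume(ref, tar, r=4):
    # ref, tar: (B, C, H, W) feature maps
    B, C, H, W = ref.shape
    tar_pad = F.pad(tar, (r, r, r, r))  # zero-pad so every shift is valid
    vols = []
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            shifted = tar_pad[:, :, dy:dy + H, dx:dx + W]
            # per-pixel correlation: mean over the channel dimension
            vols.append((ref * shifted).mean(dim=1, keepdim=True))
    return torch.cat(vols, dim=1)  # (B, (2r+1)**2, H, W)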

Error running pretrained model

While running

python3 -u /home/raymond/hd3/inference.py
--task=stereo
--data_root=""
--data_list=comb.txt
--encoder=dlaup
--decoder=hda
--batch_size=1
--workers=16
--flow_format=png
--evaluate
--model_path=/home/raymond/hd3/model_zoo/hd3f_chairs_things_sintel-5b4ad51a.pth
--save_folder=/home/raymond/ml_stereo/hd3_predicts

Traceback (most recent call last):
  File "/home/raymond/hd3/inference.py", line 243, in <module>
    main()
  File "/home/raymond/hd3/inference.py", line 129, in main
    model.load_state_dict(checkpoint['state_dict'], strict=True)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
        Missing key(s) in state_dict: "module.hd3net.Decoder_4.up.{0.weight, 0.bias, 0.running_mean, 0.running_var, 2.weight, 3.weight, 3.bias, 3.running_mean, 3.running_var}", "module.hd3net.cost_bn_5.{weight, bias, running_mean, running_var}", "module.hd3net.Decoder_5.mapping.{block1.*, block2.*, root.*}", "module.hd3net.Decoder_5.cls.{0.*, 2.weight, 2.bias}".
        Size mismatches (checkpoint shape vs. current-model shape):
        module.hd3net.cost_bn_{0..4}.{weight, bias, running_mean, running_var}: torch.Size([81]) vs. torch.Size([9])
        module.hd3net.Decoder_{0..4}.cls.2.weight: torch.Size([81, 128, 1, 1]) vs. torch.Size([9, 128, 1, 1]); cls.2.bias: torch.Size([81]) vs. torch.Size([9])
        module.hd3net.Decoder_{0..3}.up.2.weight: torch.Size([128, 81, 4, 4]) vs. torch.Size([128, 9, 4, 4]); up.3.{weight, bias, running_mean, running_var}: torch.Size([81]) vs. torch.Size([9])
        module.hd3net.Decoder_0.mapping.block1.conv1.weight: torch.Size([128, 81, 3, 3]) vs. torch.Size([128, 9, 3, 3]); shortcut.0.weight: torch.Size([128, 81, 1, 1]) vs. torch.Size([128, 9, 1, 1])
        module.hd3net.Decoder_1.mapping.block1.conv1.weight: torch.Size([128, 676, 3, 3]) vs. torch.Size([128, 531, 3, 3]); shortcut.0.weight: torch.Size([128, 676, 1, 1]) vs. torch.Size([128, 531, 1, 1])
        module.hd3net.Decoder_2.mapping.block1.conv1.weight: torch.Size([128, 420, 3, 3]) vs. torch.Size([128, 275, 3, 3]); shortcut.0.weight: torch.Size([128, 420, 1, 1]) vs. torch.Size([128, 275, 1, 1])
        module.hd3net.Decoder_3.mapping.block1.conv1.weight: torch.Size([128, 292, 3, 3]) vs. torch.Size([128, 147, 3, 3]); shortcut.0.weight: torch.Size([128, 292, 1, 1]) vs. torch.Size([128, 147, 1, 1])
        module.hd3net.Decoder_4.mapping.block1.conv1.weight: torch.Size([128, 228, 3, 3]) vs. torch.Size([128, 83, 3, 3]); shortcut.0.weight: torch.Size([128, 228, 1, 1]) vs. torch.Size([128, 83, 1, 1])

I have seen someone with a similar error before, except they had the context flag set and the c suffix in the saved model name, which I removed.

running pretrained files

Hi,

first of all thank you for sharing your code and the pre-trained files!

I was trying to evaluate the performance of the network (trained on FlyingChairs only) on KITTI.
However, it tells me that there are errors in loading the state_dict.
Have all the pretrained models been trained with the same settings and the final published version of the source code, or have I set something up incorrectly?

Best regards, Markus

python -u inference.py \
  --task=flow \
  --data_root="${KITTI2012}" \
  --data_list="lists/KITTI_flow_test_on_train_2012.txt" \
  --context \
  --encoder=dlaup \
  --decoder=hda \
  --batch_size=1 \
  --workers=12 \
  --flow_format=png \
  --evaluate \
  --model_path="hd3f_chairs-04bf114d.pth" \
  --save_folder="Kitti2012_eval/" 
:
:
:
[2019-06-14 21:08:30,103 INFO inference.py line 127 12311] => loading checkpoint '/home/markus/sources/hd3/model_zoo/hd3f_chairs-04bf114d.pth'
Traceback (most recent call last):
  File "inference.py", line 243, in <module>
    main()
  File "inference.py", line 129, in main
    model.load_state_dict(checkpoint['state_dict'], strict=True)
  File "/home/markus/anaconda3/envs/pt/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
        Missing key(s) in state_dict: "module.hd3net.Decoder_4.dc_conv_{0..6}.0.weight", "module.hd3net.Decoder_4.dc_conv_{0..6}.1.{weight, bias, running_mean, running_var}", "module.hd3net.Decoder_4.cls.weight", "module.hd3net.Decoder_4.cls.bias".
        Unexpected key(s) in state_dict: "module.hd3net.Decoder_4.mapping.block1.{conv1.weight, conv2.weight, shortcut.0.weight, bn2.*}", "module.hd3net.Decoder_4.mapping.block2.{bn1.*, conv1.weight, bn2.*, conv2.weight}", "module.hd3net.Decoder_4.mapping.root.{0.*, 2.weight}", "module.hd3net.Decoder_4.cls.{0.*, 2.weight, 2.bias}".

The output disparity maps seem wrong

I wanted to obtain the disparity maps for KITTI_stereo_test_2015, so
I ran the following command:

python -u inference.py \
  --task=stereo \
  --data_root=. \
  --data_list=./lists/KITTI_stereo_test_2015.txt \
  --context \
  --encoder=dlaup \
  --decoder=hda \
  --batch_size=1 \
  --workers=8 \
  --flow_format=png \
  --model_path=./model_zoo \
  --save_folder=./results

The results are wrong:

.\results\vis\kitti_stereo_2015\testing\image_2\000000_10.png [image]

.\results\vec\kitti_stereo_2015\testing\image_2\000000_10.png [image]

During this process I modified the source code (inference.py, line 131), because I had no checkpoint:

    if os.path.isfile(args.model_path):
        logger.info("=> loading checkpoint '{}'".format(args.model_path))
        checkpoint = torch.load(args.model_path)
        model.load_state_dict(checkpoint['state_dict'], strict=True)
        logger.info("=> loaded checkpoint '{}'".format(args.model_path))
    else:
        print("=> no checkpoint found at '{}'".format(args.model_path))
        # raise RuntimeError("=> no checkpoint found at '{}'".format(args.model_path))

I manually downloaded the models into the ./model_zoo folder:

hd3f_chairs-04bf114d.pth
hd3f_chairs_things-462a3896.pth
hd3f_chairs_things_kitti-41b15827.pth
hd3f_chairs_things_sintel-5b4ad51a.pth
hd3fc_chairs-1367436d.pth
hd3fc_chairs_things-0b92a7f5.pth
hd3fc_chairs_things_kitti-bfa97911.pth
hd3fc_chairs_things_sintel-0b6e4b67.pth
hd3s_things-8b4dcd6d.pth
hd3s_things_kitti-1243813e.pth
hd3sc_things-57947496.pth
hd3sc_things_kitti-368975c0.pth

My data is located at:

.\kitti_stereo_2015\testing\image_2
.\kitti_stereo_2015\testing\image_3

My runtime environment:

windows 10
CUDA 9.0
cuDNN 7.4.1

python:
cudatoolkit 9.0
cudnn 7.3.1
cupy 6.0.0
python 3.7.3
pytorch 1.1.0

Discrepancies model_zoo models vs. paper

Hi and thank you for sharing your wonderful work!

I was trying to reproduce the results you report on KITTI with the hd3fc_chairs_things_kitti-bfa97911.pth model from the model zoo. For this I used your inference script together with the KITTI training ground truth. While it matches quite nicely for KITTI 2012, I am getting some discrepancies for KITTI 2015 that I can't explain.

The inference script reports an average end-point error of 1.40 pixels (the paper states 1.31).
And when I use the C++ code from the KITTI homepage together with the KITTI 2015 training files, I get an Fl-all error of 4.43% (the paper states 4.1%).

Is this the actual model you used to obtain the numbers reported in your paper, or a retrained one?
Am I doing something wrong?

Best Regards Markus

error when running train.sh

Hi,
Thank you for sharing your code and the pre-trained files!
I was trying to re-train the network for stereo from the pretrained FlyingThings3D file
(hd3sc_things-57947496.pth, trained on FlyingThings3D only).

I ran train.sh as:
CUDA_VISIBLE_DEVICES=3 python -u train.py
--dataset_name=FlyingThings3D
--train_root=/home/share/34916/SceneFlowDataset/FlyingThings3D
--train_list=lists/FlyingThings3D_train_stereo_.txt
--val_root=/home/share/34916/SceneFlowDataset/FlyingThings3D
--val_list=lists/FlyingThings3D_test_stereo_.txt
--task=stereo
--base_lr=0.0002
--encoder=dlaup
--decoder=hda
--context
--workers=4
--epochs=200
--batch_size=4
--evaluate
--batch_size_val=1
--pretrain=./outputs/model/model_zoo/hd3sc_things-57947496.pth
--visual_freq=20
--save_step=50
--save_path=./outputs/model

However, I got erroneous output: the predictions are all close to zero. I printed the intermediate tensor in hd3/models/hd3net.py with the following code:
decoder = getattr(self, 'Decoder_' + str(l))
prob_map, up_feat = decoder(decoder_input)
curr_vect = density2vector(prob_map, self.dim, True)

if l > 0:
    curr_vect += up_curr_vect

if self.task == 'stereo':
    curr_vect = torch.clamp(curr_vect, max=0)
    print("curr_vect mean: ", torch.mean(curr_vect))
    print("++++++++++++++++++++++")

In the first steps, the mean of curr_vect is normal, as shown below:
[2019-11-13 02:14:00,322 INFO train.py line 259 121140] Loss total 42.9904
curr_vect mean: tensor(-1.0633, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-2.1657, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-4.5257, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-9.1923, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-18.4706, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-36.9369, device='cuda:0', grad_fn=)
++++++++++++++++++++++
[2019-11-13 02:14:01,065 INFO train.py line 256 121140] Epoch: [1/200][7/5038] Data 0.001 (0.145) Batch 0.743 (3.269) Remain 914:55:07.

But after a few hundred steps, the mean of curr_vect is almost zero at all levels, as shown below:
[2019-11-13 01:54:30,441 INFO train.py line 259 117631] Loss total 2.5231
curr_vect mean: tensor(-0.4105, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-0.0049, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-0.0027, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-0.0019, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-0.0009, device='cuda:0', grad_fn=)
++++++++++++++++++++++
curr_vect mean: tensor(-0.0006, device='cuda:0', grad_fn=)
++++++++++++++++++++++
[2019-11-13 01:54:31,190 INFO train.py line 256 117631] Epoch: [1/200][473/5038] Data 0.001 (0.003) Batch 0.750 (0.788) Remain 220:32:07.

I am very confused. I fine-tuned from hd3sc_things-57947496.pth, yet the output did not improve; it got worse
after a few hundred steps.

Convert output disparity maps to depth

I was wondering if you know the correct parameters to convert the disparity map to depth. I was using the formula:

D = baseline*focal / (disparity * img_size)

and I was using the parameters:

D = 0.54 * 721 / (disparity * 1242)

The outputs are on the order of 0.00 - 0.1, which doesn't make sense. Any idea what I'm missing here?
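
For reference, the standard pinhole conversion when disparity is measured in pixels is depth = baseline * focal / disparity, without the img_size factor (that factor is only needed if the disparity has been normalized by image width, which the formula above assumes). A sketch under the pixel-disparity assumption, using the usual KITTI values of 0.54 m baseline and ~721 px focal length:

import numpy as np

def disparity_to_depth(disparity, baseline_m=0.54, focal_px=721.0):
    # abs() because this repo's stereo outputs are non-positive,
    # and a small floor guards against division by zero
    disparity_px = np.maximum(np.abs(disparity), 1e-6)
    return baseline_m * focal_px / disparity_px  # depth in meters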

types of data augmentation

Thank you for your outstanding work! I have some questions about data augmentation.
I found that the model only uses scaling, cropping, horizontal flips and vertical flips to augment the input data when pretraining on FlyingChairs, not following the augmentation types of the original FlowNet, and only adds RandomPhotometric augmentations for fine-tuning on MPI-Sintel. Could this lead to overfitting on the MPI-Sintel test set? Would your results on the MPI-Sintel test set improve if you adopted FlowNet's data augmentations?

pretrained weights for pyramid feature extractor

First of all, thank you for sharing the weights of your fully trained models!

I was wondering if you could also share the pre-trained weights of your base feature extractor (DLA-34-Up) that you use at the start of your training. In your paper you state that you use an ImageNet pretrained model for this.
I did find pre-trained weights for DLA-34, which should be similar to DLA-34-Up except for the upsampling part, but I'm not quite sure if this is what you meant and used.

It would be great if you could add the DLA-34-Up base model for the feature extractor or maybe add a few words/links to your Readme if the DLA-34 version is what you used.

Thank you

replacing the cupy GPU FunctionCorrelation

We have run HD3 on our videos and the results are very good.
We want to convert HD3 to TorchScript so we can call it from a C++ program for a demo.
But when scripting the model we get an error at re.search('(SIZE_)([0-4])(\()([^\)]*)(\))', strKernel) in the cupy-based _FunctionCorrelation.

Is there a way to replace the cupy correlation computation with plain tensor-on-tensor operations, even on the CPU?

Many thanks.

gpu memory usage

Hello! I wonder if there is any parameter in your code I can set to use less GPU memory, or if you know of some way to do it in PyTorch. I need to do this without losing precision, so resizing the input image is not an option.

Thanks!

Inference error

Hi.
I'm trying to run your code:
python3 -u inference.py --task=stereo --data_list=test --save_folder=/home/oem/Workspace/Projects/hd3/ --encoder=dlaup --decoder=hda --model_path=/home/oem/Workspace/Projects/hd3/model_zoo/hd3s_things-8b4dcd6d.pth --data_root=/home/oem/Workspace/data/Sampler/Driving/RGB_cleanpass/ --batch_size=1 --flow_format=png --workers=16

I got:
[2020-11-23 05:11:44,837 INFO inference.py line 78 38672] Namespace(batch_size=1, context=False, data_list='test', data_root='/home/oem/Workspace/data/Sampler/Driving/RGB_cleanpass/', decoder='hda', encoder='dlaup', evaluate=False, flow_format='png', model_path='/home/oem/Workspace/Projects/hd3/model_zoo/hd3s_things-8b4dcd6d.pth', save_folder='/home/oem/Workspace/Projects/hd3/', task='stereo', workers=16)
[2020-11-23 05:11:44,837 INFO inference.py line 79 38672] => creating model ...
[2020-11-23 05:11:47,280 INFO inference.py line 124 38672] HD3Model(
...
[2020-11-23 05:11:47,290 INFO inference.py line 130 38672] => loading checkpoint '/home/oem/Workspace/Projects /hd3/model_zoo/hd3s_things-8b4dcd6d.pth'
[2020-11-23 05:11:47,392 INFO inference.py line 133 38672] => loaded checkpoint '/home/oem/Workspace/Projects /hd3/model_zoo/hd3s_things-8b4dcd6d.pth'
[2020-11-23 05:11:47,392 INFO inference.py line 144 38672] >>>>>>>>>>>>>>>> Start Test >>>>>>>>>>>>>>>>
Traceback (most recent call last):
File "inference.py", line 246, in
main()
File "inference.py", line 167, in main
output = model(
File "/home/oem/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/oem/.local/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/oem/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/oem/Workspace/Projects/hd3/hd3model.py", line 33, in forward
ms_prob, ms_vect = self.hd3net(torch.cat(img_list, 1))
File "/home/oem/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/oem/Workspace/Projects/hd3/models/hd3net.py", line 149, in forward
torch.cat([x[:, :3, :, :], x[:, 3:, :, :]], 0))
RuntimeError: Sizes of tensors must match except in dimension 1. Got 1 and 3 (The offending index is 0)

Can you tell me what the problem might be?
Note that to get it to run I had to replace @cupy.util.memoize(for_each_device=True) in models/correlation.py, line 288, with @cupy._util.memoize(for_each_device=True) (i.e. an added "_" before util).
The result is the same for different versions of Python (3.6.12 and 3.8.3), PyTorch (1.0.0 and 1.7.0) and CUDA (11.1 and 9.0).

Question on the flow normalization

Hello,
Could I ask why you normalize the flow by
vgrid[:,0,:,:] = 2.0*vgrid[:,0,:,:].clone() / max(W-1,1) - 1.0 in the warping function?

Since flow values can be negative, vgrid can also be negative, and the result of that line can then be less than -1. Isn't that outside the intended vgrid range of [-1, 1]?

Can we assume that vgrid == grid + flow always stays larger than 0?
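
For context, the usual flow-warping pattern looks like the following sketch (not the repo's exact code). grid_sample expects coordinates in [-1, 1], and values outside that range simply denote pixels that flow off the image; they are handled by padding_mode rather than being a bug:

import torch
import torch.nn.functional as F

def warp(img, flow):
    # img: (B, C, H, W), flow: (B, 2, H, W) in pixels
    B, _, H, W = img.shape
    yy, xx = torch.meshgrid(torch.arange(H), torch.arange(W))
    grid = torch.stack((xx, yy)).float().unsqueeze(0)  # (1, 2, H, W)
    vgrid = grid + flow                                # absolute sample positions
    vgrid[:, 0] = 2.0 * vgrid[:, 0].clone() / max(W - 1, 1) - 1.0  # x to [-1, 1]
    vgrid[:, 1] = 2.0 * vgrid[:, 1].clone() / max(H - 1, 1) - 1.0  # y to [-1, 1]
    # out-of-range coordinates sample zeros and are typically masked as occluded
    return F.grid_sample(img, vgrid.permute(0, 2, 3, 1), padding_mode='zeros')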

cupy.cuda.compiler.CompileException: nvrtc: error: failed to load builtins

ubuntu 16.06
pytorch 1.01 (try the 0.4.0 still error)
python 3.5
cupy cupy_cuda90-6.0.0-cp35-cp35m-manylinux1_x86_64.whl

I have set

export PATH=/usr/local/cuda-9.0/bin:$PATH
export CUDA_PATH=/usr/local/cuda-9.0:$CUDA_PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH

or used the command "CUDA_PATH=/usr/local/cuda-9.0 pip3 install cupy" to install cupy.
Either way cupy installs fine, but when I run "bash scripts/train.sh" I get:
[2019-05-23 10:16:58,638 INFO train.py line 143 23818] => no checkpoint found at 'path_to_pretrained_model'
Traceback (most recent call last):
File "/home/lvhao/.local/lib/python3.5/site-packages/cupy/cuda/compiler.py", line 242, in compile
nvrtc.compileProgram(self.ptr, options)
File "cupy/cuda/nvrtc.pyx", line 98, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 108, in cupy.cuda.nvrtc.compileProgram
File "cupy/cuda/nvrtc.pyx", line 53, in cupy.cuda.nvrtc.check_status
cupy.cuda.nvrtc.NVRTCError: NVRTC_ERROR unknown (7)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 361, in
main()
File "train.py", line 187, in main
args.batch_size)
File "train.py", line 236, in train
output = model(img_list=img_list, label_list=label_list, get_loss=True)
File "/home/lvhao/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/lvhao/.local/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 112, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/lvhao/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/lvhao/hd3-master/hd3model.py", line 32, in forward
ms_prob, ms_vect = self.hd3net(torch.cat(img_list, 1))
File "/home/lvhao/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/lvhao/hd3-master/models/hd3net.py", line 168, in forward
tensorFirst=ref_feat, tensorSecond=tar_feat_corr)
File "/home/lvhao/hd3-master/models/correlation.py", line 464, in FunctionCorrelation
return _FunctionCorrelation.apply(tensorFirst, tensorSecond)
File "/home/lvhao/hd3-master/models/correlation.py", line 328, in forward
'output': self.rbot0
File "cupy/util.pyx", line 50, in cupy.util.memoize.decorator.ret
File "/home/lvhao/hd3-master/models/correlation.py", line 291, in cupy_launch
return cupy.cuda.compile_with_cache(strKernel).get_function(strFunction)
File "/home/lvhao/.local/lib/python3.5/site-packages/cupy/cuda/compiler.py", line 136, in compile_with_cache
base = _preprocess('', options, arch)
File "/home/lvhao/.local/lib/python3.5/site-packages/cupy/cuda/compiler.py", line 97, in _preprocess
result = prog.compile(options)
File "/home/lvhao/.local/lib/python3.5/site-packages/cupy/cuda/compiler.py", line 246, in compile
raise CompileException(log, self.src, self.name, options)
cupy.cuda.compiler.CompileException: nvrtc: error: failed to load builtins

How can I solve this?

error when running inference

Hi, I downloaded the model zoo and ran this script for inference (PyTorch 1.0.1, CUDA 9.0, Python 3.6):
python -u inference.py
--task=stereo
--data_root=/home/gtruong/project/hd3
--data_list=lists/giang.txt
--context
--encoder=dlaup
--decoder=hda
--batch_size=1
--workers=16
--flow_format=png
--model_path=/home/gtruong/project/hd3/model_zoo/hd3s_things_kitti-1243813e.pth
--save_folder=/home/gtruong/project/hd3/giang

The error I got:
[2019-08-29 14:55:17,755 INFO inference.py line 127 502088] => loading checkpoint 'D:\Reimplement\hd3/model_zoo/hd3s_things_kitti-1243813e.pth'
Traceback (most recent call last):
File "inference.py", line 243, in
main()
File "inference.py", line 129, in main
model.load_state_dict(checkpoint['state_dict'], strict=True)
File "C:\Users\Giang\Anaconda3\envs\basic\lib\site-packages\torch\nn\modules\module.py", line 769, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DataParallel:
Missing key(s) in state_dict: "module.hd3net.Decoder_5.dc_conv_{0..6}.0.weight", "module.hd3net.Decoder_5.dc_conv_{0..6}.1.{weight, bias, running_mean, running_var}", "module.hd3net.Decoder_5.cls.weight", "module.hd3net.Decoder_5.cls.bias".
Unexpected key(s) in state_dict: "module.hd3net.Decoder_5.mapping.block1.{conv1.weight, conv2.weight, shortcut.0.weight, bn2.*}", "module.hd3net.Decoder_5.mapping.block2.{bn1.*, conv1.weight, bn2.*, conv2.weight}", "module.hd3net.Decoder_5.mapping.root.{0.*, 2.weight}", "module.hd3net.Decoder_5.cls.{0.*, 2.weight, 2.bias}".

encoder problem in train.py

Hi, for model training we need to initialize the weights from a pretrained model via --pretrain. If the encoder is dla, I can obtain a pretrained model via download_models.sh from the model zoo.
But if I want to train a new model whose encoder is vgg, there is no pretrained model in the model zoo. What should I do?

Question on KITTI training

Hi. In the stereo matching task, you trained the model on KITTI for 2000 epochs. But do you have a validation dataset? How do you know which epoch gives the best model? Are you using the latest model for testing?

I got similar results on the training set, and used the latest model to run the test. However, the result is not as good as yours on the test dataset.

Thank you.

problem in test.sh

The parameters are as follows:
--task=stereo
--data_root=./
--data_list=lists/KITTI_stereo_test_2015.txt
--encoder=dlaup
--decoder=hda
--batch_size=1
--workers=16
--flow_format=png
--model_path=./model_zoo/hd3s_things_kitti-1243813e.pth
--save_folder=./result

but the output is as follows:

./result/vec/kitti_stereo_2015/testing/image_2/000000_10.png [image]

./result/vis/kitti_stereo_2015/testing/image_2/000000_10.png [image]

What is my problem?
Looking forward to your reply!

How to visualize Confidence Map?

Hi, I wonder how to visualize the confidence map as shown in the paper.
I am using optical flow, and I suppose I should get it here:

hd3/hd3model.py, lines 36 to 37 (commit 1b0185d):

if get_prob:
    result['prob'] = ms_prob[-1]

but result['prob'] is a tensor with 81 channels...

Can you help me with this? Thanks.
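
One hedged way to reduce those 81 channels to a per-pixel confidence map, assuming they are unnormalized scores over the 9x9 discrete displacement grid (if they are already a normalized density, skip the softmax): take the probability mass of the most likely displacement.

import torch
import torch.nn.functional as F

prob = result['prob']               # (B, 81, H, W)
density = F.softmax(prob, dim=1)    # per-pixel distribution over displacements
confidence, _ = density.max(dim=1)  # (B, H, W), values in [0, 1]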

Extra minus

Hello,

Thank you for sharing the code.

It looks like there shouldn't be a minus sign before read_pfm_file(); please check:

disp = np.expand_dims(-read_pfm_file(file_name), axis=-1)
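
Possibly related context rather than an authoritative answer: the training issue above ("error when running train.sh") quotes hd3net.py clamping stereo vectors with torch.clamp(curr_vect, max=0), i.e. the network internally uses non-positive disparities, which would explain negating the positive PFM ground truth:

disp_gt = read_pfm_file(file_name)  # PFM files store positive disparities
disp_net_convention = -disp_gt      # matches torch.clamp(curr_vect, max=0)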

output tensors differ for exactly the same input

output1 = model(img_list=resized_img_list, label_list=label_list,
                get_vect=True, get_prob=True, get_epe=args.evaluate)
output2 = model(img_list=resized_img_list, label_list=label_list,
                get_vect=True, get_prob=True, get_epe=args.evaluate)
print("Equals: ", torch.equal(output1['vect'], output2['vect']))

prints

False

Is there some probabilistic layer in the network?

Number of epochs and training time

I wonder if there is a reason why the model needs to be trained for so many epochs (thousands for KITTI and Sintel). Does the model still improve, or can the number of epochs be shortened?

Currently I'm training on FlyingChairs, but due to GPU memory constraints I can only train at a speed of 1 hour/epoch. I know that with more GPUs (e.g. 8) this could be about 5x faster, but that still requires a huge amount of time if the number of epochs is more than 1000...

So to sum up, the question is

  1. Is the large number of epochs necessary?
  2. If it really is, how long did the training take when you ran the experiments?

Thank you!
