Hello, I ran your testing script on the NYU-v2 dataset, and it generates the following output.
arghya@arghya-Pulse-GL66-12UEK:~/CompletionFormer/src$ python3 main.py --dir_data ../data/nyudepthv2 --data_name NYU --split_json ../data_json/nyu.json --gpus 0 --max_depth 10.0 --num_sample 500 --test_only --pretrain ../src/pretrained/NYUv2.pt --save ../results
=== Arguments ===
address : localhost | affinity : TGASS | affinity_gamma : 0.5 | augment : True |
batch_size : 12 | betas : (0.9, 0.999) | conf_prop : True | data_name : NYU | dir_data : ../data/nyudepthv2 |
epochs : 72 | epsilon : 1e-08 | from_scratch : False | gamma : 0.5 | gpus : 0 |
legacy : False | lidar_lines : 64 | log_dir : ../experiments/ | loss : 1.0*L1+1.0*L2 | lr : 0.001 |
max_depth : 10.0 | milestones : [36, 48, 56, 64] | model : CompletionFormer | momentum : 0.9 | no_multiprocessing : False |
num_gpus : 1 | num_sample : 500 | num_summary : 4 | num_threads : 4 | opt_level : O0 |
optimizer : ADAMW | patch_height : 228 | patch_width : 304 | port : 29500 | preserve_input : False |
pretrain : ../src/pretrained/NYUv2.pt | print_freq : 1 | prop_kernel : 3 | prop_time : 6 | resume : False |
save : ../results | save_dir : ../experiments/230620_152501_../results | save_full : False | save_image : False | save_result_only : False |
seed : 43 | split_json : ../data_json/nyu.json | test_crop : False | test_only : True | top_crop : 0 |
warm_up : True | weight_decay : 0.01 |
2023-06-20 15:25:03,638 - mmseg - INFO - load checkpoint from local path: ./pretrained/pvt.pth
2023-06-20 15:25:03,716 - mmseg - WARNING - The model and loaded state dict do not match exactly
size mismatch for pos_embed1: copying a param with shape torch.Size([1, 3136, 64]) from checkpoint, the shape in current model is torch.Size([1, 12544, 64]).
size mismatch for patch_embed1.proj.weight: copying a param with shape torch.Size([64, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([64, 128, 2, 2]).
unexpected key in source state_dict: cls_token, norm.weight, norm.bias, head.weight, head.bias
missing keys in source state_dict: embed_layer1.0.conv1.weight, embed_layer1.0.bn1.weight, embed_layer1.0.bn1.bias, embed_layer1.0.bn1.running_mean, embed_layer1.0.bn1.running_var, embed_layer1.0.conv2.weight, embed_layer1.0.bn2.weight, embed_layer1.0.bn2.bias, embed_layer1.0.bn2.running_mean, embed_layer1.0.bn2.running_var, embed_layer1.1.conv1.weight, embed_layer1.1.bn1.weight, embed_layer1.1.bn1.bias, embed_layer1.1.bn1.running_mean, embed_layer1.1.bn1.running_var, embed_layer1.1.conv2.weight, embed_layer1.1.bn2.weight, embed_layer1.1.bn2.bias, embed_layer1.1.bn2.running_mean, embed_layer1.1.bn2.running_var, embed_layer1.2.conv1.weight, embed_layer1.2.bn1.weight, embed_layer1.2.bn1.bias, embed_layer1.2.bn1.running_mean, embed_layer1.2.bn1.running_var, embed_layer1.2.conv2.weight, embed_layer1.2.bn2.weight, embed_layer1.2.bn2.bias, embed_layer1.2.bn2.running_mean, embed_layer1.2.bn2.running_var, embed_layer2.0.conv1.weight, embed_layer2.0.bn1.weight, embed_layer2.0.bn1.bias, embed_layer2.0.bn1.running_mean, embed_layer2.0.bn1.running_var, embed_layer2.0.conv2.weight, embed_layer2.0.bn2.weight, embed_layer2.0.bn2.bias, embed_layer2.0.bn2.running_mean, embed_layer2.0.bn2.running_var, embed_layer2.0.downsample.0.weight, embed_layer2.0.downsample.1.weight, embed_layer2.0.downsample.1.bias, embed_layer2.0.downsample.1.running_mean, embed_layer2.0.downsample.1.running_var, embed_layer2.1.conv1.weight, embed_layer2.1.bn1.weight, embed_layer2.1.bn1.bias, embed_layer2.1.bn1.running_mean, embed_layer2.1.bn1.running_var, embed_layer2.1.conv2.weight, embed_layer2.1.bn2.weight, embed_layer2.1.bn2.bias, embed_layer2.1.bn2.running_mean, embed_layer2.1.bn2.running_var, embed_layer2.2.conv1.weight, embed_layer2.2.bn1.weight, embed_layer2.2.bn1.bias, embed_layer2.2.bn1.running_mean, embed_layer2.2.bn1.running_var, embed_layer2.2.conv2.weight, embed_layer2.2.bn2.weight, embed_layer2.2.bn2.bias, embed_layer2.2.bn2.running_mean, embed_layer2.2.bn2.running_var, 
embed_layer2.3.conv1.weight, embed_layer2.3.bn1.weight, embed_layer2.3.bn1.bias, embed_layer2.3.bn1.running_mean, embed_layer2.3.bn1.running_var, embed_layer2.3.conv2.weight, embed_layer2.3.bn2.weight, embed_layer2.3.bn2.bias, embed_layer2.3.bn2.running_mean, embed_layer2.3.bn2.running_var, block1.0.resblock.conv1.weight, block1.0.resblock.bn1.weight, block1.0.resblock.bn1.bias, block1.0.resblock.bn1.running_mean, block1.0.resblock.bn1.running_var, block1.0.resblock.conv2.weight, block1.0.resblock.bn2.weight, block1.0.resblock.bn2.bias, block1.0.resblock.bn2.running_mean, block1.0.resblock.bn2.running_var, block1.0.resblock.ca.fc.0.weight, block1.0.resblock.ca.fc.2.weight, block1.0.resblock.sa.conv1.weight, block1.0.concat_conv.weight, block1.1.resblock.conv1.weight, block1.1.resblock.bn1.weight, block1.1.resblock.bn1.bias, block1.1.resblock.bn1.running_mean, block1.1.resblock.bn1.running_var, block1.1.resblock.conv2.weight, block1.1.resblock.bn2.weight, block1.1.resblock.bn2.bias, block1.1.resblock.bn2.running_mean, block1.1.resblock.bn2.running_var, block1.1.resblock.ca.fc.0.weight, block1.1.resblock.ca.fc.2.weight, block1.1.resblock.sa.conv1.weight, block1.1.concat_conv.weight, block1.2.resblock.conv1.weight, block1.2.resblock.bn1.weight, block1.2.resblock.bn1.bias, block1.2.resblock.bn1.running_mean, block1.2.resblock.bn1.running_var, block1.2.resblock.conv2.weight, block1.2.resblock.bn2.weight, block1.2.resblock.bn2.bias, block1.2.resblock.bn2.running_mean, block1.2.resblock.bn2.running_var, block1.2.resblock.ca.fc.0.weight, block1.2.resblock.ca.fc.2.weight, block1.2.resblock.sa.conv1.weight, block1.2.concat_conv.weight, block2.0.resblock.conv1.weight, block2.0.resblock.bn1.weight, block2.0.resblock.bn1.bias, block2.0.resblock.bn1.running_mean, block2.0.resblock.bn1.running_var, block2.0.resblock.conv2.weight, block2.0.resblock.bn2.weight, block2.0.resblock.bn2.bias, block2.0.resblock.bn2.running_mean, block2.0.resblock.bn2.running_var, 
block2.0.resblock.ca.fc.0.weight, block2.0.resblock.ca.fc.2.weight, block2.0.resblock.sa.conv1.weight, block2.0.concat_conv.weight, block2.1.resblock.conv1.weight, block2.1.resblock.bn1.weight, block2.1.resblock.bn1.bias, block2.1.resblock.bn1.running_mean, block2.1.resblock.bn1.running_var, block2.1.resblock.conv2.weight, block2.1.resblock.bn2.weight, block2.1.resblock.bn2.bias, block2.1.resblock.bn2.running_mean, block2.1.resblock.bn2.running_var, block2.1.resblock.ca.fc.0.weight, block2.1.resblock.ca.fc.2.weight, block2.1.resblock.sa.conv1.weight, block2.1.concat_conv.weight, block2.2.resblock.conv1.weight, block2.2.resblock.bn1.weight, block2.2.resblock.bn1.bias, block2.2.resblock.bn1.running_mean, block2.2.resblock.bn1.running_var, block2.2.resblock.conv2.weight, block2.2.resblock.bn2.weight, block2.2.resblock.bn2.bias, block2.2.resblock.bn2.running_mean, block2.2.resblock.bn2.running_var, block2.2.resblock.ca.fc.0.weight, block2.2.resblock.ca.fc.2.weight, block2.2.resblock.sa.conv1.weight, block2.2.concat_conv.weight, block2.3.resblock.conv1.weight, block2.3.resblock.bn1.weight, block2.3.resblock.bn1.bias, block2.3.resblock.bn1.running_mean, block2.3.resblock.bn1.running_var, block2.3.resblock.conv2.weight, block2.3.resblock.bn2.weight, block2.3.resblock.bn2.bias, block2.3.resblock.bn2.running_mean, block2.3.resblock.bn2.running_var, block2.3.resblock.ca.fc.0.weight, block2.3.resblock.ca.fc.2.weight, block2.3.resblock.sa.conv1.weight, block2.3.concat_conv.weight, block3.0.resblock.conv1.weight, block3.0.resblock.bn1.weight, block3.0.resblock.bn1.bias, block3.0.resblock.bn1.running_mean, block3.0.resblock.bn1.running_var, block3.0.resblock.conv2.weight, block3.0.resblock.bn2.weight, block3.0.resblock.bn2.bias, block3.0.resblock.bn2.running_mean, block3.0.resblock.bn2.running_var, block3.0.resblock.ca.fc.0.weight, block3.0.resblock.ca.fc.2.weight, block3.0.resblock.sa.conv1.weight, block3.0.concat_conv.weight, block3.1.resblock.conv1.weight, 
block3.1.resblock.bn1.weight, block3.1.resblock.bn1.bias, block3.1.resblock.bn1.running_mean, block3.1.resblock.bn1.running_var, block3.1.resblock.conv2.weight, block3.1.resblock.bn2.weight, block3.1.resblock.bn2.bias, block3.1.resblock.bn2.running_mean, block3.1.resblock.bn2.running_var, block3.1.resblock.ca.fc.0.weight, block3.1.resblock.ca.fc.2.weight, block3.1.resblock.sa.conv1.weight, block3.1.concat_conv.weight, block3.2.resblock.conv1.weight, block3.2.resblock.bn1.weight, block3.2.resblock.bn1.bias, block3.2.resblock.bn1.running_mean, block3.2.resblock.bn1.running_var, block3.2.resblock.conv2.weight, block3.2.resblock.bn2.weight, block3.2.resblock.bn2.bias, block3.2.resblock.bn2.running_mean, block3.2.resblock.bn2.running_var, block3.2.resblock.ca.fc.0.weight, block3.2.resblock.ca.fc.2.weight, block3.2.resblock.sa.conv1.weight, block3.2.concat_conv.weight, block3.3.resblock.conv1.weight, block3.3.resblock.bn1.weight, block3.3.resblock.bn1.bias, block3.3.resblock.bn1.running_mean, block3.3.resblock.bn1.running_var, block3.3.resblock.conv2.weight, block3.3.resblock.bn2.weight, block3.3.resblock.bn2.bias, block3.3.resblock.bn2.running_mean, block3.3.resblock.bn2.running_var, block3.3.resblock.ca.fc.0.weight, block3.3.resblock.ca.fc.2.weight, block3.3.resblock.sa.conv1.weight, block3.3.concat_conv.weight, block3.4.resblock.conv1.weight, block3.4.resblock.bn1.weight, block3.4.resblock.bn1.bias, block3.4.resblock.bn1.running_mean, block3.4.resblock.bn1.running_var, block3.4.resblock.conv2.weight, block3.4.resblock.bn2.weight, block3.4.resblock.bn2.bias, block3.4.resblock.bn2.running_mean, block3.4.resblock.bn2.running_var, block3.4.resblock.ca.fc.0.weight, block3.4.resblock.ca.fc.2.weight, block3.4.resblock.sa.conv1.weight, block3.4.concat_conv.weight, block3.5.resblock.conv1.weight, block3.5.resblock.bn1.weight, block3.5.resblock.bn1.bias, block3.5.resblock.bn1.running_mean, block3.5.resblock.bn1.running_var, block3.5.resblock.conv2.weight, 
block3.5.resblock.bn2.weight, block3.5.resblock.bn2.bias, block3.5.resblock.bn2.running_mean, block3.5.resblock.bn2.running_var, block3.5.resblock.ca.fc.0.weight, block3.5.resblock.ca.fc.2.weight, block3.5.resblock.sa.conv1.weight, block3.5.concat_conv.weight, block4.0.resblock.conv1.weight, block4.0.resblock.bn1.weight, block4.0.resblock.bn1.bias, block4.0.resblock.bn1.running_mean, block4.0.resblock.bn1.running_var, block4.0.resblock.conv2.weight, block4.0.resblock.bn2.weight, block4.0.resblock.bn2.bias, block4.0.resblock.bn2.running_mean, block4.0.resblock.bn2.running_var, block4.0.resblock.ca.fc.0.weight, block4.0.resblock.ca.fc.2.weight, block4.0.resblock.sa.conv1.weight, block4.0.concat_conv.weight, block4.1.resblock.conv1.weight, block4.1.resblock.bn1.weight, block4.1.resblock.bn1.bias, block4.1.resblock.bn1.running_mean, block4.1.resblock.bn1.running_var, block4.1.resblock.conv2.weight, block4.1.resblock.bn2.weight, block4.1.resblock.bn2.bias, block4.1.resblock.bn2.running_mean, block4.1.resblock.bn2.running_var, block4.1.resblock.ca.fc.0.weight, block4.1.resblock.ca.fc.2.weight, block4.1.resblock.sa.conv1.weight, block4.1.concat_conv.weight, block4.2.resblock.conv1.weight, block4.2.resblock.bn1.weight, block4.2.resblock.bn1.bias, block4.2.resblock.bn1.running_mean, block4.2.resblock.bn1.running_var, block4.2.resblock.conv2.weight, block4.2.resblock.bn2.weight, block4.2.resblock.bn2.bias, block4.2.resblock.bn2.running_mean, block4.2.resblock.bn2.running_var, block4.2.resblock.ca.fc.0.weight, block4.2.resblock.ca.fc.2.weight, block4.2.resblock.sa.conv1.weight, block4.2.concat_conv.weight
===pretrained weight loaded===
Checkpoint loaded from ../src/pretrained/NYUv2.pt!
230620@15:25:45 | Test: 100%|██████████| 654/654 [00:37<00:00, 17.23it/s]
Metric | RMSE: 0.09013 MAE: 0.03519 iRMSE: 0.01378 iMAE: 0.00508 REL: 0.01189 D^1: 0.99585 D^2: 0.99940 D^3: 0.99988 D102: 0.87466 D105: 0.95325 D110: 0.98049
Elapsed time : 35.93873906135559 sec, Average processing time : 0.05495220039962628 sec
So, as I understand it, the script takes the validation h5 files as input (of the train and val folders, the val one; 654 files in total), each containing an RGB image and a depth image (dimension for rgb=??, dimension for depth=??), and outputs the metrics shown in the "Metric" line above (RMSE: 0.09013, MAE: 0.03519, and so on).
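To fill in the "??" above, I could check the dimensions myself with something like the following. This is only a minimal sketch: it assumes h5py is installed, the helper names `summarize` and `inspect_h5` are my own and not part of the repository, and I have not confirmed which dataset names the h5 files actually contain.

```python
# Sketch: list the top-level datasets of one h5 sample with their shapes.
# Assumption: h5py is installed. Helper names are hypothetical, not from the repo.


def summarize(datasets):
    """Return {name: (shape, dtype)} for a mapping of array-like datasets."""
    summary = {}
    for name, ds in datasets.items():
        shape = getattr(ds, "shape", None)
        if shape is None:
            continue  # skip entries without a shape, e.g. nested groups
        summary[name] = (tuple(shape), str(ds.dtype))
    return summary


def inspect_h5(path):
    """Open an h5 file read-only and summarize its top-level datasets."""
    import h5py  # imported here so summarize() works without h5py installed

    with h5py.File(path, "r") as f:
        return summarize(f)
```

Calling `inspect_h5` on one of the validation files (for example, a hypothetical `path/to/sample.h5`) and printing the result should show the name, shape, and dtype of each stored array.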
How can I output the corresponding completed depth images? As I understand the method, each RGB + depth input (h5) should produce a completed depth output (h5), for all 654 images. Isn't this the process? I can't find the 654 completed depth images in the output.
Thanks in advance.