parskatt / DKM
[CVPR 2023] DKM: Dense Kernelized Feature Matching for Geometry Estimation
Home Page: https://parskatt.github.io/DKM/
License: Other
Thanks for your great work and impressive results!
Since your results are so impressive, it is natural to consider combining your work with a downstream task, like visual odometry. So I am curious about the inference time of DKM compared to other methods, like LoFTR, or optical-flow methods such as RAFT. Have you done any experiments like this? I haven't seen any report of inference time in the paper, so it would be great if you could help.
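For a rough comparison one can time the models directly; below is a generic timing harness (a sketch, not tied to DKM's API; for CUDA models you would also call torch.cuda.synchronize() before reading the clock, otherwise you only measure kernel launches):

```python
import time

def time_fn(fn, *args, n_warmup=3, n_runs=10):
    """Average wall-clock seconds of fn(*args) over n_runs, after a warmup.
    For GPU models, synchronize the device before and after the timed loop."""
    for _ in range(n_warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(n_runs):
        fn(*args)
    return (time.perf_counter() - start) / n_runs

# Stand-in workload; replace with e.g. lambda: dkm_model.match(im1, im2)
avg = time_fn(lambda xs: sorted(xs), list(range(1000)))
print(f"{avg * 1e3:.3f} ms per call")
```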
Hello @Parskatt ,
Congratulations on your paper acceptance! :D
I would like to know how many GPUs you used, the memory capacity of each, and how much GPU memory is consumed during training.
Thank you!!!
Hi,
I have known camera extrinsics/intrinsics, and I would like to 3D-project the pixels of two images matched by DKM. What would be a good point in the pipeline to get the matches between the images?
Thanks!
Daniel
Congratulations on the great work!
Do you have plans for releasing the training code ? I'm thinking on fine-tuning your weights for more specific tasks.
What is the difference between DKMv3 and DKMv2?
Could you teach me how to run demo_match.py without GPUs?
It seems that dkm_model.match doesn't accept batch input now?
Thanks for your amazing work!
While trying to utilize this amazing work, I cannot understand one implementation detail in the code.
In the function "upsample_preds" (used only in the function match), the final flow is further refined using "self.conv_refiner"; however, the estimated residual flow is not added to the original final flow. Instead, a re-sampling is executed:
query_coords = torch.meshgrid((
torch.linspace(-1 + 1 / h, 1 - 1 / h, h, device="cuda:0"),
torch.linspace(-1 + 1 / w, 1 - 1 / w, w, device="cuda:0"),
))
query_coords = torch.stack((query_coords[1], query_coords[0]))
query_coords = query_coords[None].expand(b, 2, h, w)
delta_certainty, delta_flow = self.conv_refiner["1"](query, support, flow)
displaced_query_coords = torch.stack((
query_coords[:, 0] + delta_flow[:, 0] / (4 * w),
query_coords[:, 1] + delta_flow[:, 1] / (4 * h),
), dim=1)
flow = F.grid_sample(flow, displaced_query_coords.permute(0, 2, 3, 1))
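To see why that differs from an additive update, here is a 1-D toy illustration (made-up numbers, not DKM's code): resampling the flow at displaced coordinates computes the composition f(x + delta(x)), whereas an additive refinement computes f(x) + delta(x).

```python
import numpy as np

f = np.array([0.0, 1.0, 4.0, 9.0])   # toy flow values at positions x = 0..3
delta = np.array([1, 1, 1, 0])       # integer residual displacements, for clarity

additive = f + delta                                # f(x) + delta(x)
composed = f[np.clip(np.arange(4) + delta, 0, 3)]   # f(x + delta(x)), what grid_sample does

print(additive)   # [1. 2. 5. 9.]
print(composed)   # [1. 4. 9. 9.]
```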
Hi Johan, I really appreciate your great work. You have provided a pretrained model with ResNet-50, and I noticed from the code that you also have a model for ResNet-18. Is it possible to provide the pretrained model with ResNet-18? Thank you very much!
Hello, may I ask if it's possible to use DKMv3 for training when there is no depth information in my own dataset? I only need to obtain the horizontal and vertical offsets for perspective transformation.
Why does PDC-Net perform reasonably well in PCK but much worse in two-view geometry estimation?
PDC-Net PCK:
Is it because the confidence it predicts is learned in a self-supervised way? Also, is it fair to compare DKM with PDC-Net without retraining it fully supervised?
If I have understood it wrong, please point it out. Thank you! :)
Thanks for releasing such a great package.
When I import dkm in my custom script, just importing it causes additional memory occupancy on GPU 0.
Moving the following lines into tensor_to_pil() seems to fix this:
imagenet_mean = torch.tensor([0.485, 0.456, 0.406]).to(device)
imagenet_std = torch.tensor([0.229, 0.224, 0.225]).to(device)
(https://github.com/Parskatt/DKM/blob/main/dkm/utils/utils.py#L294)
Could you check?
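A common fix for import-time device allocation is lazy initialization; here is a sketch of the pattern (in the real utils.py the cached values would be torch tensors moved to `device`):

```python
_stats_cache = {}

def get_imagenet_stats(device="cpu"):
    """Build the normalization constants on first use instead of at module
    import time, so merely importing the package never touches the GPU."""
    if device not in _stats_cache:
        _stats_cache[device] = (
            [0.485, 0.456, 0.406],  # imagenet_mean
            [0.229, 0.224, 0.225],  # imagenet_std
        )
    return _stats_cache[device]

mean, std = get_imagenet_stats()
```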
Hi,
I have quickly checked the possibility of stripping out the training code and integrating the bare DKM into kornia.
It seems that the main blocker is numpy.random.choice
Line 585 in b3311e2
And PyTorch doesn't have an equivalent:
pytorch/pytorch#16897
Is there a way we can use existing pytorch functionality to get a similar thing?
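One torch-native option is torch.multinomial, which draws indices with probability proportional to a weight vector, with or without replacement (a sketch; this assumes the np.random.choice call is doing weighted index sampling, which is what its p= argument does):

```python
import torch

def weighted_sample(weights: torch.Tensor, num: int, replacement: bool = False) -> torch.Tensor:
    """Draw `num` indices with probability proportional to `weights`.
    torch.multinomial normalizes the weights internally, so no division is needed."""
    return torch.multinomial(weights, num, replacement=replacement)

w = torch.tensor([0.0, 0.0, 1.0, 3.0])
idx = weighted_sample(w, 2)
# zero-weight indices (0 and 1) are never drawn
```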
In DKM's sample method, do you intend to set any certainty greater than 0.05 to 1, instead of, perhaps, setting any certainty less than 0.05 to 0?
Line 574 in fb85d63
I ask because you then go on to sample from the matches based on that certainty, which can result in points with certainty less than 0.05 being sampled. In fact, if all the matches have certainty below 0.05, then expansion_factor*num matches would still be selected.
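A small numeric check of that behavior (illustrative numbers only): after the clamp, points below the threshold keep their raw certainty as a sampling weight, so they can still be drawn, just rarely.

```python
import numpy as np

rng = np.random.default_rng(0)
certainty = np.array([0.01, 0.02, 0.5, 0.9])

clamped = certainty.copy()
clamped[clamped > 0.05] = 1.0        # what the code does: [0.01, 0.02, 1.0, 1.0]

p = clamped / clamped.sum()
draws = rng.choice(4, size=1000, p=p)
counts = np.bincount(draws, minlength=4)
print(counts)  # indices 0 and 1 can still appear, despite certainty < 0.05
```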
Hello. Not completely sure if this is an issue, but I am a bit confused about what the conf_thresh parameter should be used for. I'm referring to this line in dkm.py. Shouldn't this line come after the filtering with relative_confidence_threshold? Is it correct to say that conf_thresh filters out outliers and the inliers are given the same probability?
Hi!
While trying to reproduce the results using the Mega + Synthetic dataset with "train_mega_synthetic.py", I noticed that in the training code the model is set to DKM (version 1), in lines 31-33. Does this indicate that "train_mega_synthetic.py" is designed for DKM v1, or is the script suitable for both versions?
Thanks!
LoFTR uses only 15,300 = 153*100 image training pairs from MegaDepth, while DKM iteratively samples 150,000 pairs per iteration from 10,661,614 total training pairs over 53 iterations.
When using
dense_matches, dense_certainty = model.match(img1PIL, img2PIL, check_cycle_consistency=False, do_pred_in_og_res=True)
I got:
738 if do_pred_in_og_res: # Will assume that there is no batching going on.
739 og_query, og_support = self.og_transforms((im1, im2))
--> 740 query_to_support, dense_certainty = self.matcher.upsample_preds(
741 query_to_support,
742 dense_certainty,
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
1129 return modules[name]
1130 raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1131 type(self).__name__, name))
1132
1133 def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:
AttributeError: 'RegressionMatcher' object has no attribute 'matcher'
I believe you need to rename the variable self.matcher to self.decoder in line 740 of dkm.py.
Hi Johan,
Thanks for your great contribution,
I noticed that you used Gaussian processes to encode feature maps in the global matcher. We find this approach very novel and completely different from the global 4D-correlation volume used in previous methods.
We wondered what motivated you to use Gaussian processes to model this, and why the Gaussian process is suitable for solving this warp-prediction problem.
Best wishes,
Weiguang Zhang
I am performing an image-matching test with demo_match.py. What is the structure of the warp obtained by `warp, certainty = dkm_model.match(im1_path, im2_path, device=device)`, and what information is stored in it? I want to get the pixel-matching relationship between the two pictures through warp; how should I do it? Thank you very much!
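For reference, DKM's warp is an (H, W, 4) grid where each entry holds (x_A, y_A, x_B, y_B) in normalized [-1, 1] coordinates, and the repo exposes dkm_model.sample(warp, certainty) and dkm_model.to_pixel_coordinates(...) for exactly this. If you want to do the conversion by hand, here is a sketch assuming that coordinate convention:

```python
import numpy as np

def warp_to_pixel_matches(warp: np.ndarray, h_a: int, w_a: int, h_b: int, w_b: int):
    """Convert an (H, W, 4) warp of normalized (x_A, y_A, x_B, y_B) coordinates
    in [-1, 1] into pixel coordinates of images A and B."""
    coords = warp.reshape(-1, 4)
    kpts_a = np.stack([(coords[:, 0] + 1) / 2 * w_a,
                       (coords[:, 1] + 1) / 2 * h_a], axis=-1)
    kpts_b = np.stack([(coords[:, 2] + 1) / 2 * w_b,
                       (coords[:, 3] + 1) / 2 * h_b], axis=-1)
    return kpts_a, kpts_b

warp = np.zeros((1, 1, 4))  # a single match at the centers of both images
ka, kb = warp_to_pixel_matches(warp, 480, 640, 240, 320)
print(ka, kb)  # [[320. 240.]] [[160. 120.]]
```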
Great work!
I wonder what is the effect of
"
low_res_certainty = factor * low_res_certainty * (low_res_certainty < cert_clamp)
...
dense_certainty = dense_certainty - low_res_certainty
" in dkm/models/dkm.py?
Also, shouldn't this
"
query_coords = torch.meshgrid((
    torch.linspace(-1 + 1 / hs, 1 - 1 / hs, hs, device=device),
    torch.linspace(-1 + 1 / ws, 1 - 1 / ws, ws, device=device),
))
"
be
"
query_coords = torch.meshgrid((
    torch.linspace(-1 + 1 / (2 * hs), 1 - 1 / (2 * hs), hs, device=device),
    torch.linspace(-1 + 1 / (2 * ws), 1 - 1 / (2 * ws), ws, device=device),
))
"?
Thanks!
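For what it's worth, the existing endpoints already coincide with pixel centers under align_corners=False, where center i sits at (2i + 1)/h - 1 (a quick numeric check):

```python
import numpy as np

h = 4
grid = np.linspace(-1 + 1 / h, 1 - 1 / h, h)   # as constructed in dkm.py
centers = (2 * np.arange(h) + 1) / h - 1       # grid_sample pixel centers, align_corners=False
print(grid)  # [-0.75 -0.25  0.25  0.75]
assert np.allclose(grid, centers)
```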
Congratulations on this amazing result. Where can I read your latest papers?
Like SuperGlue and LoFTR did
Hi Johan, I really appreciate your great work. Is it possible to provide the pretrained model here? Thank you very much!
Dear @Parskatt, dear authors, thank you for this repo! Great work!
Could you please clarify on the license of the provided DKMv3 model's weights for indoor and outdoor? Were they trained on datasets that imply open license?
Thank you in advance!
Hey!
First of all, great work! I was wondering if we can get a match score between two images from this model?
I noticed that you use GC-RANSAC for pose estimation. I tried to look into the GC-RANSAC repository but couldn't get it to work. I was hoping to take a look at your code and ask which version of GC-RANSAC you are using.
Thanks for your excellent work.
I noticed a confusing detail. When you load images from the MegaDepth dataset, you resize them to a fixed ht and wt, which changes the original aspect ratio (some images may have been taken vertically). I'm wondering why you do not maintain the aspect ratio and pad the image to the specified size, which is common practice in other computer-vision tasks. Is it because the padded areas significantly interfere with the estimation of the warp? If so, would masking out the warp generated by the padded areas be a good solution?
Thanks again!
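A sketch of the suggested alternative (aspect-preserving resize plus padding; the sizes are illustrative, and the padded region would then need to be masked out of the warp and certainty):

```python
def resize_with_padding_shape(h: int, w: int, target: int):
    """Scale (h, w) so the longer side equals `target`, keeping the aspect
    ratio, and return the resized shape plus the padding needed to fill a
    square (target, target) canvas."""
    scale = target / max(h, w)
    nh, nw = round(h * scale), round(w * scale)
    return (nh, nw), (target - nh, target - nw)

# A vertically-taken 640x480 image fits a 640x640 canvas with 160 px of width padding
(new_h, new_w), (pad_h, pad_w) = resize_with_padding_shape(640, 480, 640)
print(new_h, new_w, pad_h, pad_w)  # 640 480 0 160
```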
Could you post the part of the code that works on the synthetic dataset? I want to try it on cross-spectrum data.
Hi Johan, thanks for your excellent work!
I wonder why np.random.choice is used for selecting good_samples instead of ordering by confidence. Is it for selecting more spatially spread-out keypoints?
Thanks for your excellent work! May I ask for one possible solution for the problem shown as below? Thank you so much!
Traceback (most recent call last):
File "experiments/dkm/train_DKMv3_outdoor.py", line 259, in
train(args)
File "experiments/dkm/train_DKMv3_outdoor.py", line 250, in train
wandb.log(megadense_benchmark.benchmark(model))
File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/benchmarks/megadepth_dense_benchmark.py", line 72, in benchmark
matches, certainty = model.match(im1, im2, batched=True)
File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 695, in match
dense_corresps = self.forward(batch, batched = True)
File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 631, in forward
dense_corresps = self.decoder(f_q_pyramid, f_s_pyramid)
File "/mnt/data-disk-1/home/cpii.local/wtwang/miniconda3/envs/im/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 494, in forward
new_stuff = self.gps[new_scale](f1_s, f2_s, dense_flow=dense_flow)
File "/mnt/data-disk-1/home/cpii.local/wtwang/miniconda3/envs/im/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/mnt/data-disk-1/home/cpii.local/wtwang/IM/codes/DKM/dkm/models/dkm.py", line 360, in forward
K_yy_inv = torch.linalg.inv(K_yy + sigma_noise)
torch._C._LinAlgError: linalg.inv: (Batch element 0): The diagonal element 512 is zero, the inversion could not be completed because the input matrix is singular.
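One common mitigation for this class of failure (a sketch, not a diagnosis of the root cause; DKM already adds sigma_noise to the kernel, so a fully singular matrix usually means the features themselves collapsed, e.g. NaN or all-zero rows) is to increase the diagonal jitter, or to use a linear solve instead of an explicit inverse:

```python
import numpy as np

K = np.zeros((4, 4))                          # degenerate kernel matrix, e.g. from all-zero features
jitter = 1e-4 * np.eye(4)                     # larger diagonal regularization than the default
x = np.linalg.solve(K + jitter, np.ones(4))   # succeeds where inv(K) would raise LinAlgError
print(x)
```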
Hi, your great work is really amazing! I can't wait for the full demo release of DKMv3. When do you intend to do that?
Hi Johan,
When I train for about 160k steps, I get a checkpoint for testing, and then obtain different testing results on repeated runs. For example, 1st testing result: 58.345 (AUC@5), 2nd testing result: 59.449, 3rd testing result: 59.853.
I think this is caused by the randomness of sampling. Will this randomness be alleviated as the number of iterations increases?
Thank you so much for your help!