shiaoming / alike

ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction

Home Page: https://arxiv.org/pdf/2112.02906.pdf

License: BSD 3-Clause "New" or "Revised" License

ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction

ALIKE applies a differentiable keypoint detection module to detect accurate sub-pixel keypoints. The network runs at 95 frames per second on 640 x 480 images on an NVIDIA Titan X (Pascal) GPU and achieves performance on par with state-of-the-art methods. ALIKE benefits real-time applications on resource-limited platforms and devices. Technical details are described in this paper.

Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter C. Y. Chen, Zhengguo Li, "ALIKE: Accurate and Lightweight Keypoint
Detection and Descriptor Extraction," IEEE Transactions on Multimedia, 2022.

If you use ALIKE in an academic work, please cite:

@article{Zhao2023ALIKED,
    title = {ALIKED: A Lighter Keypoint and Descriptor Extraction Network via Deformable Transformation},
    url = {https://arxiv.org/pdf/2304.03608.pdf},
    doi = {10.1109/TIM.2023.3271000},
    journal = {IEEE Transactions on Instrumentation and Measurement},
    author = {Zhao, Xiaoming and Wu, Xingming and Chen, Weihai and Chen, Peter C. Y. and Xu, Qingsong and Li, Zhengguo},
    year = {2023},
    volume = {72},
    pages = {1-16},
}

@article{Zhao2022ALIKE,
    title = {ALIKE: Accurate and Lightweight Keypoint Detection and Descriptor Extraction},
    url = {http://arxiv.org/abs/2112.02906},
    doi = {10.1109/TMM.2022.3155927},
    journal = {IEEE Transactions on Multimedia},
    author = {Zhao, Xiaoming and Wu, Xingming and Miao, Jinyu and Chen, Weihai and Chen, Peter C. Y. and Li, Zhengguo},
    month = mar,
    year = {2022},
}

1. Prerequisites

The required packages are listed in requirements.txt:

pip install -r requirements.txt

2. Models

Off-the-shelf weights for the four ALIKE model variants are provided in models/.

3. Run demo

$ python demo.py -h
usage: demo.py [-h] [--model {alike-t,alike-s,alike-n,alike-l}]
               [--device DEVICE] [--top_k TOP_K] [--scores_th SCORES_TH]
               [--n_limit N_LIMIT] [--no_display] [--no_sub_pixel]
               input

ALike Demo.

positional arguments:
  input                 Image directory or movie file or "camera0" (for
                        webcam0).

optional arguments:
  -h, --help            show this help message and exit
  --model {alike-t,alike-s,alike-n,alike-l}
                        The model configuration
  --device DEVICE       Running device (default: cuda).
  --top_k TOP_K         Detect top K keypoints. -1 for threshold based mode,
                        >0 for top K mode. (default: -1)
  --scores_th SCORES_TH
                        Detector score threshold (default: 0.2).
  --n_limit N_LIMIT     Maximum number of keypoints to be detected (default:
                        5000).
  --no_display          Do not display images to screen. Useful if running
                        remotely (default: False).
  --no_sub_pixel        Do not detect sub-pixel keypoints (default: False).
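
The options listed above could be declared with argparse roughly as follows. This is a sketch reconstructed from the help text only, not a copy of demo.py; the default for --model is an assumption, since the help output does not show one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the help text (reconstruction, not demo.py itself)."""
    parser = argparse.ArgumentParser(description="ALike Demo.")
    parser.add_argument("input", type=str,
                        help='Image directory or movie file or "camera0" (for webcam0).')
    parser.add_argument("--model",
                        choices=["alike-t", "alike-s", "alike-n", "alike-l"],
                        default="alike-t",  # assumption: no default is shown in the help
                        help="The model configuration")
    parser.add_argument("--device", type=str, default="cuda",
                        help="Running device (default: cuda).")
    parser.add_argument("--top_k", type=int, default=-1,
                        help="-1 for threshold based mode, >0 for top K mode.")
    parser.add_argument("--scores_th", type=float, default=0.2,
                        help="Detector score threshold (default: 0.2).")
    parser.add_argument("--n_limit", type=int, default=5000,
                        help="Maximum number of keypoints to be detected.")
    parser.add_argument("--no_display", action="store_true",
                        help="Do not display images to screen.")
    parser.add_argument("--no_sub_pixel", action="store_true",
                        help="Do not detect sub-pixel keypoints.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["assets/tum", "--top_k", "500", "--no_display"])
print(args.input, args.top_k, args.scores_th, args.no_display, args.no_sub_pixel)
```

With these arguments the detector would run in top K mode (500 keypoints) without opening a display window, while the score threshold and keypoint limit keep their defaults.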

4. Examples

KITTI example

python demo.py assets/kitti 

TUM example

python demo.py assets/tum 

5. Efficiency and performance

Models        Parameters   GFLOPs (640x480)   MHA@3 on Hpatches   mAA(10°) on IMW2020-test (Stereo)
D2-Net(MS)    7653KB       889.40             38.33%              12.27%
LF-Net(MS)    2642KB       24.37              57.78%              23.44%
SuperPoint    1301KB       26.11              70.19%              28.97%
R2D2(MS)      484KB        464.55             71.48%              39.02%
ASLFeat(MS)   823KB        77.58              73.52%              33.65%
DISK          1092KB       98.97              70.56%              51.22%
ALike-N       318KB        7.909              75.74%              47.18%
ALike-L       653KB        19.685             76.85%              49.58%

Evaluation on Hpatches

  • Download hpatches-sequences-release and put it into hseq/hpatches-sequences-release.
  • Remove the unreliable sequences, as done in D2-Net.
  • Run the following command to evaluate the performance:
    python hseq/eval.py

For more details, please refer to the paper.

alike's Issues

train

I would like to know why an error occurs when I use a model trained with the training code.
[Screenshot, 2023-06-30 10-07-45]
Thanks!

About Aachen Day-Night dataset evaluation results

I really appreciate your wonderful work and inference code.

While reading your paper, I had some questions about the details.

Are the visual localization results on the Aachen dataset in the paper for night only?

And in the case of unlimited keypoints, what value did you set for the maximum number of keypoints?

8000, or more than 8000?

Thank you.

MISSING JSON FILE

When I was training the network, I encountered an error indicating a missing file.
image

Export the model to ONNX

Hi author,

Thank you for your amazing work.

I'm trying to export the ALIKE models to ONNX for inference on edge devices. However, some operations in the model are not supported for conversion to ONNX, including grid_sampler and argsort.

Have you tried converting the model to ONNX before, and was it successful? If so, could you please guide me through the export to ONNX? Many thanks in advance!

Best regards,
Michael

Repeatable loss: is it exp() or softmax()?

Thank you for the great job!
I have a small question about the repeatable loss. In the ALIKE paper, the matching probability map is obtained by normalizing the similarity map, but it uses exp(), while the descriptor loss uses softmax().
Is this a careless mistake, or have I misunderstood?
Thanks
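
For readers following this question, the difference between the two operations can be shown on made-up numbers. The sketch below is not the loss from the paper or the repository; the similarity values and the temperature are hypothetical. It only illustrates that exp() alone rescales each score independently, while softmax() additionally divides by the sum so the outputs form a probability distribution.

```python
import math


def exp_only(scores, temperature=1.0):
    """Apply exp(s / t) to each score independently (no normalization)."""
    return [math.exp(s / temperature) for s in scores]


def softmax(scores, temperature=1.0):
    """Apply exp(s / t) and divide by the sum, so the outputs add up to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow, result is unchanged
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


similarities = [0.9, 0.2, -0.3]   # hypothetical cosine similarities
temperature = 0.1                 # hypothetical temperature

unnormalized = exp_only(similarities, temperature)
probabilities = softmax(similarities, temperature)
print(sum(unnormalized), sum(probabilities))
```

Both outputs rank the scores in the same order; only the softmax output sums to one.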

Training code

Hi Shiaoming,
Do you plan to release the training code?
I read this: 'We will publish the training code for ALIKED (the subsequent work of ALIKE, which we are working on). Then you can refer to the training code of ALIKED, they have the same training pipeline.'
But I see nothing in the ALIKED repository.

Thank you.

Reprojection loss

In your paper III-C1:
For a warped keypoint p_AB, we find its closest detected keypoint p_B within th_gt pixels distance as its corresponding keypoint.

  • What is the th_gt pixels distance in reprojection loss?
  • Do you calculate the distance of keypoints on the score map before DKD?
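
The correspondence rule quoted above can be sketched as a nearest-neighbour search with a distance cutoff. This is an illustration of the quoted sentence only, not the authors' implementation; the value of th_gt is exactly what the question asks about, so the 5.0 used below is a placeholder, and treating "within" as "less than or equal to" is also an assumption.

```python
import math


def match_warped_keypoints(warped_ab, detected_b, th_gt):
    """For each warped keypoint, return the index of the closest detected
    keypoint if it lies within th_gt pixels, otherwise None."""
    matches = []
    for xa, ya in warped_ab:
        best_idx, best_dist = None, float("inf")
        for j, (xb, yb) in enumerate(detected_b):
            dist = math.hypot(xa - xb, ya - yb)
            if dist < best_dist:
                best_idx, best_dist = j, dist
        # Keep the nearest neighbour only if it is close enough.
        matches.append(best_idx if best_dist <= th_gt else None)
    return matches


warped_ab = [(10.2, 5.1), (40.0, 40.0)]     # made-up warped keypoints p_AB
detected_b = [(10.0, 5.0), (100.0, 100.0)]  # made-up detected keypoints p_B
matches = match_warped_keypoints(warped_ab, detected_b, th_gt=5.0)  # placeholder th_gt
print(matches)
```

The first warped keypoint finds a partner about 0.22 pixels away, while the second has no detected keypoint within the cutoff and is left unmatched.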

I'm very sorry, I have some doubts about your convolutional neural network framework

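My understanding of this is not very deep, but I have some questions about the training scheme and the gradient backpropagation proposed in your paper. The paper uses an NMS layer, which makes the overall network non-differentiable. You use the score map in the neighborhood of each point to build a differentiable scheme, which amounts to cutting the whole image into many small patches and optimizing only those patches. But isn't the initial step that determines the NMS point locations left out of the training optimization? How do you make sure the initial NMS point locations are correct?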
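
The patch-based idea described in this question can be illustrated with a soft-argmax over a small window centred on an integer NMS maximum. This sketch is an assumption about the general technique, not the repository's detection module; the patch values and the temperature are made up. The integer location from NMS receives no gradient, while the offset computed inside the patch is a smooth function of the scores.

```python
import math


def soft_argmax_offset(patch, temperature=0.1):
    """Return the expected (dx, dy) offset from the patch centre, weighting each
    cell by the softmax of its score. `patch` is a square list of lists."""
    radius = len(patch) // 2
    peak = max(max(row) for row in patch)
    weights = [[math.exp((s - peak) / temperature) for s in row] for row in patch]
    total = sum(sum(row) for row in weights)
    # Columns give the x offset, rows give the y offset.
    dx = sum(w * (c - radius) for row in weights for c, w in enumerate(row)) / total
    dy = sum(w * (r - radius) for r, row in enumerate(weights) for w in row) / total
    return dx, dy


# Made-up 3x3 score patch: the centre is the NMS maximum, and the right
# neighbour is also strong, so the refined keypoint shifts to the right.
patch = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.8],
    [0.0, 0.0, 0.0],
]
dx, dy = soft_argmax_offset(patch)
print(dx, dy)
```

The sub-pixel keypoint would then be the integer NMS location plus (dx, dy).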

repeatability issue

I have trained ALIKE and the results are quite good. But I noticed that the val_peatability score decreases. Do you have any comment on why? Could other losses, such as the reliability loss, harm repeatability?
image

Thanks in advance.

Training code

This is a very interesting work, that achieves some impressive results. I am very interested in adapting ALIKE to my needs. Would you be willing to share/publish the code used for training the network?

about reprojection error

Does the model trained with the reprojection error work better on the Hpatches dataset, for example in terms of localization error? Has anyone tried?
Please!

eval

When I run hseq/eval.py, every MMA@3 is greater than MMA@5.
Is the code wrong?

Also, for the loop "for thr in range(1, 4):", thr should take the values 1, 2, 3, not 1, 3, 5.
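
The two points raised here can be checked on made-up numbers. The sketch assumes the usual cumulative definition of MMA (the fraction of matches whose error is at most the threshold); it is not the code in hseq/eval.py, and the error values are invented. Under that definition, MMA can only stay equal or grow as the threshold increases.

```python
def mma(errors, threshold):
    """Fraction of matches whose reprojection error is at most `threshold` pixels."""
    if not errors:
        return 0.0
    return sum(e <= threshold for e in errors) / len(errors)


errors = [0.4, 1.7, 2.9, 4.2, 6.5]   # made-up per-match errors in pixels

print(list(range(1, 4)))       # [1, 2, 3]
print(list(range(1, 6, 2)))    # [1, 3, 5]

# Evaluate at thresholds 1, 3 and 5 pixels.
scores = {thr: mma(errors, thr) for thr in range(1, 6, 2)}
print(scores)
```

If a report shows MMA@3 above MMA@5, the likely causes are that the values are not cumulative or that the thresholds are labelled differently from how they are computed.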

Format conversion

Hello, author! The model saved by your training code is in ckpt format, but the test code expects pth format. May I ask how this ckpt can be converted to pth?
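
One possible conversion is sketched below, under assumptions that are not confirmed by the repository: that the .ckpt file is a PyTorch Lightning style checkpoint holding the weights under a "state_dict" key, and that the keys may carry a wrapper prefix. The prefix value and file names are hypothetical and need to be checked against the actual checkpoint. Recent PyTorch versions may also need weights_only=False in torch.load when the checkpoint stores extra Python objects.

```python
def extract_state_dict(checkpoint, prefix=""):
    """Return the weights from a checkpoint dict, stripping `prefix` from keys.

    Falls back to the dict itself when there is no "state_dict" entry.
    """
    state_dict = checkpoint.get("state_dict", checkpoint)
    return {
        (key[len(prefix):] if prefix and key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def convert_ckpt_to_pth(ckpt_path, pth_path, prefix=""):
    """Load a .ckpt file and save only its weights as a .pth file."""
    import torch  # imported lazily so extract_state_dict works without PyTorch

    checkpoint = torch.load(ckpt_path, map_location="cpu")
    torch.save(extract_state_dict(checkpoint, prefix), pth_path)


# Dry run on a plain dict standing in for a loaded checkpoint.
fake_checkpoint = {"epoch": 3, "state_dict": {"model.conv1.weight": [1.0, 2.0]}}
weights = extract_state_dict(fake_checkpoint, prefix="model.")  # hypothetical prefix
print(weights)
```

The resulting key names still have to match what the test code expects when it loads the pth file.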

Reproduce results

Upon examination of your code, I noticed the existence of a parameter named "sub_pixel," which is set to False by default. Upon further investigation, I discovered that enabling this parameter by setting it to True can result in improved performance. I am interested in understanding whether there are any potential drawbacks to enabling this parameter during inference. Furthermore, I would like to inquire about the rationale behind disabling this parameter by default.
