
A2PM-MESA

The family of Area to Point Matching (A2PM) methods.

This is a user-friendly implementation of the Area to Point Matching (A2PM) framework, powered by hydra.

It contains the implementations of SGAM (arXiv'23), a training-free version of MESA (CVPR'24), and DMESA (arXiv'24).

Due to the power of hydra, the implementation is highly configurable and easy to extend.

It supports the implementation of feature matching approaches that adopt the A2PM framework, and enables new combinations of point matching and area matching methods.

Qualitative Results of MESA and DMESA

(Figure: qualitative matching results of MESA and DMESA.)


Installation

To begin with, you need to install the dependencies following the instructions below.

Environment Creation

  • We recommend using conda to create a new environment for this project.
conda create -n A2PM python==3.8
conda activate A2PM

Basic Dependencies

  • Install torch, torchvision, and torchaudio by running the following command.

    • Note that the torch version requirement is soft; the code is tested on torch==2.0.0+cu118.
    pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118
  • Install the basic dependencies by running the following command.

    pip install -r requirements.txt

Usage: hydra-based Configuration

This code is based on hydra, a powerful configuration system for Python applications; see the hydra documentation for details.

In the following, we will introduce how to use the code by describing its components with hydra configurations.
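
As a minimal sketch of how a hydra entry point looks (the config_path and config_name below are placeholders following hydra's standard pattern, not necessarily this repository's exact layout):

    import hydra
    from omegaconf import DictConfig, OmegaConf

    # hydra composes the final config from the conf/ tree plus any
    # command-line overrides, e.g. +experiment=xxx.
    @hydra.main(config_path="conf", config_name="config", version_base=None)
    def main(cfg: DictConfig) -> None:
        print(OmegaConf.to_yaml(cfg))  # inspect the fully resolved configuration

    if __name__ == "__main__":
        main()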

Dataset

We offer dataloaders for two widely used datasets: ScanNet1500 and MegaDepth1500.

  • You can follow the instructions in LoFTR to download the datasets and set the paths in the configuration files in conf/dataset/.

  • Our methods rely on segmentation results, whose paths also need to be set in the configuration files in conf/dataset/ (see the sketch after this list). For example,

    • sem_mode: the segmentation method used (SAM, SEEM, or GT)
    • sem_folder: path to the folder containing the segmentation results
    • sem_post: the file format of the segmentation results (for ScanNet, npy when using SAM and png when using SEEM or GT)
  • The segmentation process will be discussed in the following section.

  • More datasets can be easily added by adding new dataloaders in dataloader/ and setting the corresponding configurations in conf/dataset/.
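
For concreteness, the segmentation-related fields can be accessed like this minimal sketch (the values and folder path are illustrative assumptions, not the repository's defaults):

    from omegaconf import OmegaConf

    # Illustrative fragment of a conf/dataset/ entry; only the three keys
    # documented above are shown, and the path is a placeholder.
    cfg = OmegaConf.create({
        "sem_mode": "SAM",                     # one of SAM, SEEM, GT
        "sem_folder": "/path/to/seg_results",  # folder with segmentation results
        "sem_post": "npy",                     # npy for SAM on ScanNet; png for SEEM/GT
    })
    print(cfg.sem_folder, cfg.sem_post)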

Segmentation Preprocessing

The segmentation results are needed for the area matching methods.

  • To use Segment Anything Model (SAM) for segmentation, we provide our inference code in segmentor/. To use it, you need to:
    • clone the SAM repository and put it into the segmentor/.. folder, corresponding to the path set in the segmentor/SAMSeger.py - L23.
    • install the dependencies in the SAM repository.
    • set the pre-trained model path in the configuration dict in segmentor/ImgSAMSeg.py - L34.

Usage

  • See segmentor/sam_seg.sh for the usage of the SAM segmentation code.
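
For orientation, a bare-bones SAM inference step looks roughly like the following (this uses the official segment-anything API; the checkpoint and file paths are placeholders, and the repository's own wrapper in segmentor/ImgSAMSeg.py is the authoritative version):

    import cv2
    import numpy as np
    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

    # Load a pre-trained SAM checkpoint (placeholder path).
    sam = sam_model_registry["vit_h"](checkpoint="/path/to/sam_vit_h.pth")
    mask_generator = SamAutomaticMaskGenerator(sam)

    # SAM expects an HxWx3 uint8 RGB image.
    image = cv2.cvtColor(cv2.imread("/path/to/image.jpg"), cv2.COLOR_BGR2RGB)
    masks = mask_generator.generate(image)  # list of dicts with a 'segmentation' mask

    # Stack the boolean masks and save them, e.g. as .npy (the ScanNet format above).
    np.save("/path/to/seg_results/image.npy",
            np.stack([m["segmentation"] for m in masks]))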

Area Matching

Area matching establishes semantic area matches between two images to reduce matching redundancy; it is the core of the A2PM framework.

  • We provide three area matchers, including:
    • Semantic Area Matching (our method in SGAM)

      • The implementation is in area_matchers/sem_am.py.
      • The configuration is in conf/area_matcher/sem_area_matcher.yaml.
      • See also the repository SGAM-code.
    • MESA-free

      • A training-free version of MESA.
      • The implementation is in area_matchers/mesa.py.
      • The configuration is in conf/area_matcher/mesa-f.yaml.
      • MESA-free eliminates the need to train the area similarity calculation module in MESA and directly uses the off-the-shelf patch matching of ASpanFormer, as DMESA does.
      • MESA-free is easier to use, but its performance is slightly lower than that of the original MESA.
    • DMESA

      • A dense counterpart of MESA proposed in the paper, which is more efficient and flexible.
      • The implementation is in area_matchers/dmesa.py.
      • The configuration is in conf/area_matcher/dmesa.yaml.
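
Because each area matcher ships with its own YAML, a typical hydra pattern is to build whichever matcher the composed config selects. The sketch below assumes the YAML files carry a standard _target_ field (hydra's usual instantiation mechanism); treat it as an illustration rather than this repository's exact code:

    from hydra.utils import instantiate
    from omegaconf import DictConfig

    def build_area_matcher(cfg: DictConfig):
        # cfg.area_matcher holds e.g. the contents of mesa-f.yaml or dmesa.yaml;
        # with a _target_ field, hydra constructs the matcher class directly.
        return instantiate(cfg.area_matcher)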

Point Matching

Point matching establishes point matches between two (area) images.

  • Here, we provide four point matchers.

  • Their configurations are in conf/point_matcher/, with wrappers in point_matchers/.

  • For some of them, internal paths need to be modified; we have already fixed these in the submodules in point_matchers/.

  • Before running, you need to download the pre-trained models and put them in the corresponding paths in the configuration yaml files.

  • More point matchers can be easily added by writing similar wrappers, as sketched below.
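
A minimal (hypothetical) wrapper shape might look like this; the class and method names are illustrative only, not the repository's actual interface:

    import numpy as np

    class PointMatcherWrapper:
        """Hypothetical adapter: exposes a third-party matcher behind one interface."""

        def __init__(self, model):
            self.model = model  # e.g. a loaded LoFTR/ASpanFormer/DKM-style model

        def match(self, img0: np.ndarray, img1: np.ndarray):
            """Return (kpts0, kpts1): two Nx2 arrays of corresponding points."""
            raise NotImplementedError  # each wrapper implements its own inference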

Match Fusion (Geometry Area Matching)

We fuse the matches from inside-area point matching across multiple areas via the geometry area matching (GAM) module; the coordinate bookkeeping behind this fusion is sketched after the list below.

  • The configuration is in conf/geo_area_matcher/.

  • We provide two fusion methods, including:

    • original GAM proposed in SGAM
      • code in geo_area_matcher/gam.py
      • configuration in conf/geo_area_matcher/gam.yaml
    • a more effective GAM
      • code in geo_area_matcher/egam.py
      • configuration in conf/geo_area_matcher/egam.yaml
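
The central step in any such fusion is mapping matches from area (crop) coordinates back to full-image coordinates before pooling and geometric filtering. A simplified sketch, where the function and argument names are illustrative assumptions:

    import numpy as np

    def area_to_image_coords(kpts, area_box, crop_size):
        """Map Nx2 keypoints from a resized area crop back to full-image coordinates.

        area_box: (x0, y0, x1, y1) of the matched area in the original image.
        crop_size: (w, h) the crop was resized to before point matching.
        """
        x0, y0, x1, y1 = area_box
        scale = np.array([(x1 - x0) / crop_size[0], (y1 - y0) / crop_size[1]])
        return kpts * scale + np.array([x0, y0])

    # The fused match set is the concatenation over all matched areas, typically
    # followed by a geometric consistency check (e.g. RANSAC).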

A2PM

The A2PM framework will combine the above components to form a complete feature matching pipeline.

  • The implementation is in scripts/test_a2pm.py. You can run the shell script scripts/test_in_dev.sh to test the A2PM framework on a pair of images.

  • The pipeline configuration is set in conf/experiment/*.yaml. You can choose the one you want by setting +experiment=xxx in the shell script.
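
The same composition can also be reproduced programmatically with hydra's compose API; a sketch (config_name is a guess at the root config, and xxx is a placeholder experiment name, as in the scripts):

    from hydra import initialize, compose

    # Build the configuration the shell scripts assemble via +experiment=xxx.
    # config_path is relative to the calling file.
    with initialize(version_base=None, config_path="conf"):
        cfg = compose(config_name="config", overrides=["+experiment=xxx"])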

Evaluation

  • We provide the evaluation code in metric/. The evaluation metrics include:

    • Pose estimation AUC
    • Mean Matching Accuracy
    • Area Matching Accuracy
  • metric/instance_eval.py evaluates instance-level matching results; it is used in test_a2pm.py.

  • metric/eval_ratios.py evaluates batch-level matching results; set the paths in the file and run it to get the evaluation results.
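
For reference, Pose AUC@t is commonly computed as the area under the recall-vs-pose-error curve up to a threshold of t degrees. A standard implementation of this metric (following the convention popularized by the SuperGlue evaluation code, not necessarily this repository's exact code):

    import numpy as np

    def pose_auc(errors, thresholds=(5, 10, 20)):
        """AUC of the pose-recall curve at the given error thresholds (degrees)."""
        errors = np.sort(np.asarray(errors, dtype=np.float64))
        recall = (np.arange(len(errors)) + 1) / len(errors)
        errors = np.concatenate(([0.0], errors))
        recall = np.concatenate(([0.0], recall))
        aucs = []
        for t in thresholds:
            idx = np.searchsorted(errors, t)
            r = np.concatenate((recall[:idx], [recall[idx - 1]]))
            e = np.concatenate((errors[:idx], [t]))
            aucs.append(np.trapz(r, x=e) / t)
        return aucs  # e.g. [AUC@5, AUC@10, AUC@20]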


Benchmark Test

You can run the benchmark test by running the shell script such as:

./scripts/dmesa-dkm-md.sh # DMESA+DKM on MegaDepth1500

You can change the configurations in the shell script to test different methods, i.e., the +experiment=xxx setting.

Expected Results of provided scripts

Taking DKM as an example, the expected results are as follows:

| SN1500 ($640\times480$) | DKM | MESA-free+DKM | DMESA+DKM |
| --- | --- | --- | --- |
| Pose AUC@5 | 30.26 | 31.64 | 30.96 |
| Pose AUC@10 | 51.51 | 52.80 | 52.41 |
| Pose AUC@20 | 69.43 | 70.08 | 69.74 |

| MD1500 ($832\times832$) | DKM | MESA-free+DKM | DMESA+DKM |
| --- | --- | --- | --- |
| Pose AUC@5 | 63.61 | 63.85 | 65.65 |
| Pose AUC@10 | 76.75 | 77.38 | 78.46 |
| Pose AUC@20 | 85.72 | 86.47 | 86.97 |
  • In this evaluation code, we fix the random seed to 2 (see scripts/test_a2pm.py), unlike the setting in our paper, where the seed was not fixed. Thus, the results differ slightly from those in the paper, but the effectiveness of our methods is consistent.

  • Also, the parameters in the configuration files are set to their default values; due to the complexity of the parameter settings, we have not tuned them for the best performance.

    • Better results can be achieved by tuning the parameters for specific datasets and tasks.
    • However, the default parameters are enough to show the effectiveness of our methods.

Citation

If you find this work useful, please consider citing:

@article{SGAM,
  title   = {Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching},
  author  = {Zhang, Yesheng and Zhao, Xu and Qian, Dahong},
  journal = {arXiv preprint arXiv:2305.00194},
  year    = {2023}
}

@inproceedings{MESA,
  author    = {Zhang, Yesheng and Zhao, Xu},
  title     = {MESA: Matching Everything by Segmenting Anything},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {20217-20226}
}

@misc{DMESA,
  title         = {DMESA: Densely Matching Everything by Segmenting Anything},
  author        = {Yesheng Zhang and Xu Zhao},
  year          = {2024},
  eprint        = {2408.00279},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}

Acknowledgement

We thank the authors of the repositories our code builds upon for their great work.
