gmberton / geo_warp

Official repository of ICCV21 paper "Viewpoint Invariant Dense Matching for Visual Geolocalization"

License: MIT License

Python 100.00%
computer-vision geo-localization geolocalization image-retrieval pytorch re-ranking reranking visual-geo-localization visual-place-recognition

Viewpoint Invariant Dense Matching for Visual Geolocalization: Official PyTorch implementation

This is the official implementation of the ICCV 2021 paper:

G. Berton, C. Masone, V. Paolicelli and B. Caputo, Viewpoint Invariant Dense Matching for Visual Geolocalization

[ICCV OpenAccess] [ArXiv] [Video] [BibTex]

Setup

First, download the baseline models, which were trained following the procedure in the NetVLAD paper. We provide a script that downloads the six models used: every combination of three backbone encoders (AlexNet, VGG-16 and ResNet-50) with two pooling/aggregation layers (GeM and NetVLAD). The models are saved automatically in data/pretrained_baselines.

python download_pretrained_baselines.py
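After the download, it can be handy to confirm that all six checkpoints are in place. The following is a minimal sketch, not part of the repository. It assumes every checkpoint is named <arch>_<pooling>.pth, as in the alexnet_gem.pth path used in the training example; the identifiers vgg16, resnet50 and netvlad are guesses and may differ from the real file names.

```python
from itertools import product
from pathlib import Path

# Only "alexnet" and "gem" appear in the README commands; the other
# identifiers below are assumptions made for illustration.
BACKBONES = ["alexnet", "vgg16", "resnet50"]
POOLINGS = ["gem", "netvlad"]
BASELINES_DIR = Path("data/pretrained_baselines")


def expected_baseline_paths(root=BASELINES_DIR):
    """Return one checkpoint path per (backbone, pooling) combination."""
    return [Path(root) / f"{arch}_{pooling}.pth"
            for arch, pooling in product(BACKBONES, POOLINGS)]


def missing_baselines(root=BASELINES_DIR):
    """Return the expected checkpoint paths that do not exist on disk."""
    return [path for path in expected_baseline_paths(root) if not path.is_file()]


if __name__ == "__main__":
    for path in missing_baselines():
        print(f"missing: {path}")
```

An empty output means every expected file was found under the assumed naming scheme.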

Then prepare your geo-localization dataset so that the directory tree looks like this:

dataset_name
└── images
    ├── train
    │   ├── gallery
    │   └── queries
    ├── val
    │   ├── gallery
    │   └── queries
    └── test
        ├── gallery
        └── queries

and the images are named as @UTM east@UTM north@[email protected]
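Before training, you may want to check that a dataset follows this layout. The sketch below is not part of the repository. The helper names are hypothetical, and the filename parser assumes that a name is a sequence of @-separated fields whose first two fields are the UTM east and UTM north coordinates.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")
SUBSETS = ("gallery", "queries")


def missing_folders(dataset_root):
    """Return the folders required by the layout that are not present."""
    images = Path(dataset_root) / "images"
    return [images / split / subset
            for split in SPLITS
            for subset in SUBSETS
            if not (images / split / subset).is_dir()]


def utm_from_filename(filename):
    """Parse (UTM east, UTM north) from a filename.

    Assumption: the name starts with '@' and the first two '@'-separated
    fields are the east and north coordinates.
    """
    fields = Path(filename).name.split("@")
    if len(fields) < 3 or fields[0] != "":
        raise ValueError(f"unexpected filename format: {filename}")
    return float(fields[1]), float(fields[2])


if __name__ == "__main__":
    for folder in missing_folders("dataset_name"):
        print(f"missing folder: {folder}")
```

The parser raises ValueError for names that do not start with @, so malformed files are caught early rather than silently skipped.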

Dependencies

See requirements.txt

Training

You can train the model using train.py. Here is an example with the lightest and fastest model (AlexNet + GeM):

python train.py --arch alexnet --pooling gem --resume_fe data/pretrained_baselines/alexnet_gem.pth

For the full set of options and an explanation of each parameter, run python train.py -h. The script creates a folder at ./runs/default/YYYY-MM-DD_HH-mm-ss where logs and checkpoints are saved. At the end of training, the script reports the results of the baseline model as well as the results with GeoWarp re-ranking applied.
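Because the run folders are named by timestamp, they sort chronologically by name. Here is a minimal sketch of how such a path could be built and how the most recent run could be located. It is not taken from train.py: the function names are hypothetical, and the strftime pattern assumes that mm in the folder name stands for minutes.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir_path(base="runs", exp_name="default", now=None):
    """Build a run folder path of the form base/exp_name/YYYY-MM-DD_HH-MM-SS."""
    now = now or datetime.now()
    return Path(base) / exp_name / now.strftime("%Y-%m-%d_%H-%M-%S")


def latest_run(base="runs", exp_name="default"):
    """Return the most recent run folder, or None if there is none.

    Zero-padded timestamps sort lexicographically in chronological order.
    """
    root = Path(base) / exp_name
    if not root.is_dir():
        return None
    runs = sorted(p for p in root.iterdir() if p.is_dir())
    return runs[-1] if runs else None


if __name__ == "__main__":
    print(make_run_dir_path())
    print(latest_run())
```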

Evaluation

You can use this code to compute the results with our trained models. To reproduce the results from the paper, download our models by running

python download_trained_hom_reg.py

which will automatically download the models and save them under data/trained_homography_regressions. Then to obtain the results you can execute

python eval.py --arch alexnet --pooling gem --resume_fe data/pretrained_baselines/alexnet_gem.pth --resume_hr data/trained_homography_regressions/alexnet_gem.pth

This reproduces the results in Table 1 of the paper. For the full set of options and an explanation of each parameter, run python eval.py -h.
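To evaluate several models in a row, the command line can be assembled programmatically. The sketch below uses only the flags shown in the command above. It assumes that the checkpoints for other models follow the same <arch>_<pooling>.pth naming as alexnet_gem.pth, which the README does not confirm, and build_eval_command is a hypothetical helper.

```python
import shlex


def build_eval_command(arch, pooling):
    """Assemble the eval.py command line for one backbone/pooling pair.

    Assumption: both checkpoints are named <arch>_<pooling>.pth.
    """
    name = f"{arch}_{pooling}.pth"
    return [
        "python", "eval.py",
        "--arch", arch,
        "--pooling", pooling,
        "--resume_fe", f"data/pretrained_baselines/{name}",
        "--resume_hr", f"data/trained_homography_regressions/{name}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) to execute it.
    print(shlex.join(build_eval_command("alexnet", "gem")))
```

Building the command as a list avoids shell quoting issues when it is later passed to subprocess.run.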

Visualization of self-supervised data

You can generate and visualize self-supervised data from a single image by running

python visualize_ss_data.py --image_path data/example.jpg --k 0.8

The script generates four images (notation is consistent with the paper):

  1. ./data/ss_img_source.jpg: the source image I, with the visualization of the two quadrilaterals tx (orange) and ty (purple) and their intersection tz (green) as defined in the paper;
  2. ./data/ss_proj_a.jpg: the first projection Ia, with the projection ta of the intersection (green);
  3. ./data/ss_proj_b.jpg: the second projection Ib, with the projection tb of the intersection (green);
  4. ./data/ss_proj_intersection.jpg: the projection of the intersection.

You can change the value of k to see how this influences the training data.
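To give some intuition for what projecting a quadrilateral involves, here is a minimal pure-Python sketch that estimates the homography mapping the four corners of a quadrilateral onto the corners of a unit square. It is not the code in visualize_ss_data.py and it does not reproduce how the paper samples tx and ty or how k is used. The function names and the example corner coordinates are made up for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_points(src, dst):
    """Estimate a 3x3 homography from four (x, y) -> (u, v) pairs.

    The bottom-right entry is fixed to 1, which leaves 8 unknowns.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1), rearranged likewise.
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear_system(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def apply_homography(h, point):
    """Map one (x, y) point through the homography h."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


if __name__ == "__main__":
    # Hypothetical quadrilateral in normalized image coordinates.
    quad = [(0.1, 0.2), (0.9, 0.1), (0.8, 0.9), (0.2, 0.8)]
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    h = homography_from_points(quad, square)
    for corner in quad:
        print(corner, "->", apply_homography(h, corner))
```

Fixing the last matrix entry to 1 keeps the system square and easy to solve, at the cost of failing for homographies whose true bottom-right entry is zero.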

Example of randomly generated images: source image, projection A, projection B, projected intersection.

BibTeX

If you use this code in your project, please cite us using:

@InProceedings{Berton_ICCV_2021,
    author    = {Berton, Gabriele and Masone, Carlo and Paolicelli, Valerio and Caputo, Barbara},
    title     = {Viewpoint Invariant Dense Matching for Visual Geolocalization},
    booktitle = {ICCV},
    month     = {October},
    year      = {2021},
    pages     = {12169-12178}
}
