This project is forked from matchlab-imperial/keras_triplet_descriptor.


N-HPatches Baseline Code

This repository contains the baseline code for the Deep Learning Coursework Project (EE3-25).

The project aims to create a learned descriptor that can perform matching, verification and retrieval tasks on N-HPatches, a noisy version of the HPatches dataset.

We will keep updating the repository with more detailed explanations and improved code. Please be aware of the version you are using and keep track of future changes.

HPatches Dataset

HPatches is based on 116 sequences: 57 sequences present photometric changes, while the other 59 show geometric deformations due to viewpoint changes. Each sequence includes a reference image and 5 target images, each with varying photometric or geometric changes. Homographies relating the reference image to the target images are provided.

Patches are sampled in the reference image using a combination of local feature extractors (Hessian, Harris and DoG detectors). The patch orientation is estimated as a single major orientation using Lowe's method. No affine adaptation is used; therefore, all patches are square regions in the reference image. Afterwards, patches are projected onto the target images using the ground-truth homographies. Hence, a set of corresponding patches contains one patch from each image in the sequence.

In practice, when a detector extracts corresponding regions in different images, it does so with a certain amount of noise. To simulate this noise, detections are perturbed using three settings: EASY, HARD and TOUGH, each one created by increasing the geometric transformation applied, which results in increased detector noise. In other words, the larger the geometric transformation, the harder the task at hand becomes.
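The perturbation idea above can be sketched as jittering the corners of each detected patch, with the jitter magnitude growing from EASY to TOUGH. The magnitudes below are illustrative assumptions, not the values used by the dataset:

```python
import numpy as np

# Hypothetical jitter magnitudes (in pixels) per difficulty setting;
# the actual HPatches perturbations differ.
JITTER = {"easy": 1.0, "hard": 3.0, "tough": 6.0}

def perturb_corners(corners, setting, rng):
    """Add Gaussian jitter to a (4, 2) array of patch corner coordinates."""
    sigma = JITTER[setting]
    return corners + rng.normal(0.0, sigma, size=corners.shape)

rng = np.random.default_rng(0)
corners = np.array([[0, 0], [64, 0], [64, 64], [0, 64]], dtype=float)
for setting in ("easy", "hard", "tough"):
    moved = perturb_corners(corners, setting, rng)
    print(setting, np.abs(moved - corners).mean())
```

A larger jitter moves the reprojected region further from the true correspondence, which is what makes TOUGH triplets harder to match.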

The following images show the reprojected easy/hard patch locations in the target images together with the extracted patches; the images can be found in the original GitHub repository:


Image 1: Visualization of the easy patches locations in the target images.


Image 2: Extracted easy patches from the example sequence.


Image 3: Visualization of the hard patches locations in the target images.


Image 4: Extracted hard patches from the example sequence.

You can find more details on the original HPatches dataset here.

N-HPatches Dataset

The N-HPatches dataset is a noisy version of HPatches. In addition to the original patches, each sequence contains a noisy version of each of them. Different amounts of noise are added depending on the three settings described above: EASY, HARD and TOUGH, from low to high noise.
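As a rough sketch of the idea (not the dataset's actual noise model), zero-mean Gaussian noise with a setting-dependent standard deviation can stand in for the EASY/HARD/TOUGH corruption; the sigma values here are assumptions for illustration:

```python
import numpy as np

# Illustrative noise strengths for patches with values in [0, 255];
# the real N-HPatches noise model is not reproduced here.
NOISE_SIGMA = {"easy": 5.0, "hard": 15.0, "tough": 30.0}

def add_noise(patch, setting, rng):
    """Return a noisy copy of `patch`, clipped back to the valid range."""
    noisy = patch + rng.normal(0.0, NOISE_SIGMA[setting], size=patch.shape)
    return np.clip(noisy, 0.0, 255.0)

rng = np.random.default_rng(0)
clean = np.full((32, 32), 128.0)
noisy = add_noise(clean, "tough", rng)
print(np.abs(noisy - clean).mean())
```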

Patches are downsampled from their original size, 65x65, to 32x32. We will test the performance of the descriptor on noisy patches; however, clean patches can be used during training.
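The 65x65 to 32x32 size change can be sketched with dependency-free nearest-neighbour index selection; the dataset's own preprocessing may use a different interpolation:

```python
import numpy as np

def downsample(patch, size=32):
    """Nearest-neighbour downsampling of a square patch (e.g. 65x65 -> 32x32).

    This is only a sketch of the size change; the actual dataset
    resizing may use a different interpolation method.
    """
    idx = (np.arange(size) * patch.shape[0]) // size
    return patch[np.ix_(idx, idx)]

patch = np.zeros((65, 65))
small = downsample(patch)
print(small.shape)  # (32, 32)
```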

The N-HPatches dataset can be found here. The data structure is the same as in HPatches, so please refer to the original paper for further details.

Baseline approach

This repository contains a baseline pipeline for the N-HPatches descriptor project.

The pipeline is based on two consecutive networks: the first network produces a cleaner version of the noisy input patch, while the second network computes the final descriptor.

The architectures chosen as a first approach are a shallow U-Net for the denoising part and the L2-Net architecture for the descriptor.

To train the denoising model, the baseline code uses the Mean Absolute Error (MAE) as the loss function between the network's output for a noisy patch and the corresponding clean patch.
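Written out in numpy, this objective is simply the mean of the per-pixel absolute differences (what Keras computes when the model is compiled with the `'mean_absolute_error'` loss):

```python
import numpy as np

def mae(denoised, clean):
    """Mean Absolute Error between the denoiser output and the clean patch."""
    return np.mean(np.abs(denoised - clean))

clean = np.ones((32, 32))
denoised = np.full((32, 32), 0.75)
print(mae(denoised, clean))  # 0.25
```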

The baseline code, on the other hand, trains the descriptor with the triplet loss, which takes an anchor patch, a positive patch and a negative patch. The idea is to train the network so that the descriptors of the anchor and positive patches are close to each other, while the descriptors of the anchor and negative patches are far apart. To do so, the code generates three instances of the network (sharing weights) and the training triplets. Other architectures or loss functions could be used to improve both steps separately, or even to merge them and optimize them jointly.
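The standard margin-based triplet loss on descriptor vectors can be sketched as follows; the margin value is illustrative and the baseline's exact setting may differ:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    `margin` is an illustrative choice, not necessarily the baseline's value.
    """
    d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # close to the anchor
n = np.array([3.0, 0.0])   # far from the anchor
print(triplet_loss(a, p, n))  # 0.0 -> triplet already satisfies the margin
```

The loss is zero once the negative is at least `margin` further from the anchor than the positive, so training only pushes on triplets that still violate that separation.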


Image 5: Pipeline at inference time.


Evaluation metrics

We evaluate descriptors using the mean average precision (mAP) on three different tasks: Patch Verification, Image Matching and Patch Retrieval. These tasks are designed to imitate typical use cases of local descriptors. The final score is the mean of the mAPs over all tasks. For details on how the evaluation is computed, refer to the HPatches benchmark.
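For intuition, average precision over one ranked list of candidate matches can be computed as below (the benchmark's own implementation may differ in detail); mAP is then the mean of this value over all queries:

```python
import numpy as np

def average_precision(labels):
    """AP for one ranked list; labels[i] is 1 if the i-th ranked
    candidate is a true correspondence, else 0."""
    labels = np.asarray(labels, dtype=float)
    if labels.sum() == 0:
        return 0.0
    ranks = np.arange(1, len(labels) + 1)
    precision_at_k = np.cumsum(labels) / ranks  # precision at each rank
    # Average the precision over the ranks of the true correspondences.
    return float((precision_at_k * labels).sum() / labels.sum())

print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2 = 0.8333...
```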

Patch Verification measures the ability of a descriptor to classify whether two patches are extracted from the same measurement.

Image Matching tests to what extent a descriptor can correctly identify correspondences in two images.

Patch Retrieval tests how well a descriptor can match a query patch to a pool of patches extracted from many images, including many distractors.

Contributors

alopezgit, axelbarroso, stemmr
