
Visually Similar Image Search

Finding similar images can be useful in many cases; for example, one can use it to retrieve mountain photos from a large personal gallery. In this project, we aim to achieve this kind of search by using a nearest-neighbour approach over image features produced by pretrained neural networks.

Motivation

Neural networks learn to extract features from data without any explicit knowledge. Not only are these features relevant to the problems the networks were trained on, they can also be useful for other tasks. In this project, we use three pretrained networks, namely VGG16, ResNet152, and DenseNet, and take their features as a representation of each image. We hope that visually similar images have similar representations; in other words, that these images lie close together in the feature space.


Fig. 1: Project Overview

This representation enables us to perform nearest-neighbour search. We use Annoy, a library for approximate nearest-neighbour search.
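As an illustration only (not the project's actual code), the pipeline could be sketched as follows, assuming PyTorch, torchvision, and the annoy package; the image file names are hypothetical:

    # Sketch: extract features with a pretrained VGG16 and index them with
    # Annoy for approximate nearest-neighbour search.
    import torch
    import torchvision.models as models
    import torchvision.transforms as T
    from annoy import AnnoyIndex
    from PIL import Image

    # Standard ImageNet preprocessing.
    preprocess = T.Compose([
        T.Resize(256),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    vgg = models.vgg16(pretrained=True).eval()

    def extract_features(path):
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            feats = vgg.features(img)                # (1, 512, 7, 7)
        return feats.mean(dim=[2, 3]).squeeze(0)     # global average pool -> (512,)

    # "Angular" (cosine) distance tends to work well for CNN features.
    index = AnnoyIndex(512, "angular")
    for i, path in enumerate(["img0.jpg", "img1.jpg"]):  # hypothetical files
        index.add_item(i, extract_features(path).numpy())
    index.build(10)                                      # 10 trees

    print(index.get_nns_by_item(0, 5))  # the 5 nearest neighbours of item 0

Here the features come from VGG16's last convolutional block; the project compares ResNet152 and DenseNet features in the same way.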

Analysis Tool



We have developed a website that provides an interface for exploring the results of our experiments in an informative and reproducible way. If you are interested in digging into the results yourself, please give it a try 😎.

Experiment 1: Visually Similar Artworks

In this first part, we randomly sample 5,000 artworks from MoMA's collection. The goal is to explore how the images of these artworks are projected into the feature space.


Fig. 2: Visually Similar Artworks

From Fig. 2, we can see that the nearest neighbours in these feature spaces are visibly related to the given images. For example, if we look at the artwork Pettibon with Strings, the similar artworks found with ResNet152 and DenseNet also contain faces. Please explore our analysis tool for more examples.

Experiment 2: Recovering Perturbed Artworks

The goal of this experiment is to verify whether the neural-network features of close images are also more or less the same; in other words, whether images that are semantically the same to us stay close in feature space. As shown in Fig. 3, we use five perturbation profiles to perturb 1,000 original images from MoMA's collection, producing close images for the experiment.


Fig. 3: Perturbation Profiles
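The actual profiles are the ones shown in Fig. 3; purely as an illustration, a perturbation profile can be thought of as a simple image transform. The five transforms below are hypothetical stand-ins, assuming Pillow:

    # Hypothetical perturbation profiles; each maps a PIL image to a
    # perturbed copy of itself.
    from PIL import Image, ImageEnhance, ImageFilter

    def blur(img):      return img.filter(ImageFilter.GaussianBlur(radius=2))
    def rotate(img):    return img.rotate(5)
    def brighten(img):  return ImageEnhance.Brightness(img).enhance(1.5)
    def grayscale(img): return img.convert("L").convert("RGB")
    def crop(img):
        w, h = img.size
        return img.crop((w // 10, h // 10, 9 * w // 10, 9 * h // 10)).resize((w, h))

    PROFILES = [blur, rotate, brighten, grayscale, crop]

    original = Image.open("artwork.jpg").convert("RGB")  # hypothetical file
    perturbed = [p(original) for p in PROFILES]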

Therefore, if the representation of an image and its perturbed versions are similar, we should be able to recover those perturbed images when performing nearest neighbour search.


Fig. 4: Recovering Perturbed Artworks

As shown in Fig. 4, the two original images and their perturbed versions lie close together in the feature spaces, particularly in VGG16's. For these two examples, VGG16's feature space allows us to recover 4/5 perturbed versions, while the feature spaces of the other networks recover 3/5.

With this setting, we can also quantitatively measure performance by looking at precision, recall, and F1-score:

  • Precision: (no. of correctly returned samples) / (no. of returned samples)
  • Recall: (no. of correctly returned samples) / (no. of relevant samples in the data)
  • F1-score: 2 · (precision · recall) / (precision + recall)

In this case, the number of correctly returned samples is simply how many of an image's perturbed versions are returned; the number of relevant samples is 5, because we have five perturbation profiles; and the number of returned samples is k, which takes the values 1, 3, and 5.
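A sketch of the per-query computation, assuming the Annoy index from the earlier sketch and a hypothetical id layout in which we know the ids of an image's five perturbed versions:

    # k neighbours are retrieved for the original image; a neighbour counts
    # as correct if it is one of that image's perturbed versions.
    def precision_recall_f1(index, item_id, relevant_ids, k):
        neighbours = index.get_nns_by_item(item_id, k + 1)  # may include query
        returned = [n for n in neighbours if n != item_id][:k]
        correct = len(set(returned) & set(relevant_ids))
        precision = correct / k
        recall = correct / len(relevant_ids)                # 5 profiles
        f1 = 0.0 if correct == 0 else 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # e.g. the image at id 0, whose perturbed versions are ids 1..5:
    for k in (1, 3, 5):
        print(k, precision_recall_f1(index, 0, [1, 2, 3, 4, 5], k))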


Fig. 5: Averaged Precision, Recall, and F1-score

From Fig. 5, we can see that VGG16 performs quite well on average and better than the other architectures for this purpose. The large variance suggests that in some cases ResNet152 and DenseNet also embed close images to nearby locations in their feature spaces; this might be a good subject for further analysis.

Future Work

  • Try with more samples. Maybe 10,000 artworks?
  • Use features from autoencoders (vanilla, VAE).
  • Train autoencoders with the following scheme: perturbed image -> autoencoder -> original image (see the sketch below).
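A minimal sketch of that last idea (a denoising-autoencoder training step, assuming PyTorch; the architecture is a toy stand-in):

    import torch
    import torch.nn as nn

    # Toy stand-in architecture; a real model would have a proper bottleneck.
    autoencoder = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),
    )
    optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    def train_step(perturbed, original):
        # Reconstruct the clean image from its perturbed version, so the
        # learned features become robust to the perturbations.
        optimizer.zero_grad()
        loss = loss_fn(autoencoder(perturbed), original)
        loss.backward()
        optimizer.step()
        return loss.item()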

Development

Please refer to DEVELOPMENT.md.

Acknowledgements
