
This project forked from ailecs/pdqcontainer


Python based Docker container exposing PDQ hasher as a RESTful API (PDQ by @facebook).

License: MIT License



PDQ Container

Description

This project hosts an implementation of Facebook's PDQ, a perceptual hashing algorithm used for measuring image similarity. A summary of the algorithm is available here.

Please note this is a reference implementation, and is not recommended for use in production systems.


Setup

Clone/download this project, and build the image:

foo@bar:/foo/PDQContainer/$ docker build -t ailecs/pdqhasher:latest .

This can take a while (depending on bandwidth and CPU, YMMV), because the build installs build-essential and other large(ish) packages, then clones and compiles the PDQ binaries.

The -t ailecs/pdqhasher:latest option is not required, but it does make the next step easier.

Once built, the image can be run as a container using

foo@bar:~/$ docker run -p 8080:8080 ailecs/pdqhasher:latest

If you didn't use -t, you'll need to look up the image ID and use it in place of the tag. Images can be listed using

foo@bar:~/$ docker image ls

Note: This project ships with an example hashset (based on a reduced subset of Google Open Images), which you can ignore or use yourself. You may find it easier to keep your hash files in a local directory and mount it into the container:

foo@bar:~/$ docker run -p 8080:8080 -v /path/to/your/hashsets:/app/python/hashsets ailecs/pdqhasher:latest

Usage

The container exposes a RESTful API with Swagger 2.0 compliant documentation, which can be accessed at http://localhost:8080/ui. We recommend using this documentation to learn the API.

To hash a file: POST a file (multipart, named file_to_upload) to http://localhost:8080/pdq/hash. The hash (encoded as a hex string) is returned.

To search for near matches to a file: POST a file (multipart, named file_to_upload) to http://localhost:8080/pdq. An array of matches (refer to the Swagger doc for specifications) is returned.
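As a rough illustration of the two calls above, here is a minimal Python client that uses only the standard library. The helper names build_multipart and post_file are hypothetical, and the base URL assumes the port mapping from Setup. The response is returned as raw text because its exact shape is described in the Swagger doc rather than here.

```python
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8080"  # assumption: container started with -p 8080:8080


def build_multipart(field_name, filename, data):
    """Encode one file as a multipart/form-data body.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_file(path, endpoint):
    """POST the file at `path` to `endpoint`; return the raw response text."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            "file_to_upload", os.path.basename(path), fh.read()
        )
    request = urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example usage (requires the container to be running):
# print(post_file("photo.jpg", "/pdq/hash"))  # hash of the image
# print(post_file("photo.jpg", "/pdq"))       # near matches for the image
```

The same helper serves both endpoints because they differ only in path; the form field is named file_to_upload in each case.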


Licensing

This is released under an MIT licence. External dependencies for the software in this project may be subject to more restrictive arrangements.


Notes

The service uses MIH to accelerate lookups within the configured Hamming distance. Where MIH is not used, it falls back to linear search, which compares the query against every stored hash, so expect lookups to slow down as your dataset grows!
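To make the cost of linear search concrete, the sketch below compares a query hash against every entry in a hashset by Hamming distance. It is an illustration only, not the service's own code; the function names are hypothetical, and the hashes are assumed to be equal-length hex strings (a PDQ hash is 256 bits, i.e. 64 hex characters).

```python
def hamming_distance(hex_a, hex_b):
    """Count the differing bits between two equal-length hex-encoded hashes."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def linear_search(query, hashset, max_distance):
    """Check every hash in `hashset` against `query`.

    Returns (hash, distance) pairs within `max_distance`, closest first.
    """
    matches = []
    for candidate in hashset:
        distance = hamming_distance(query, candidate)
        if distance <= max_distance:
            matches.append((candidate, distance))
    return sorted(matches, key=lambda pair: pair[1])
```

Every query touches every stored hash, so lookup time grows in proportion to the size of the hashset.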

Contributors

jdalins
