Giter Site home page Giter Site logo

wagtail-clip's Introduction

Wagtail CLIP

Wagtail CLIP allows you to search your Wagtail images using natural language queries.

Intro Video

This project was inspired by, and draws heavily from memery, by deepfates et al. It makes use of OpenAI's CLIP model (it was very nice of them to open source it, cheers).

An example project is available here

Installation

You can install this package as follows (requires Git):

pip install \
    wagtailclip@git+https://github.com/MattSegal/wagtail-clip.git \
    -f https://download.pytorch.org/whl/torch_stable.html \
    torch==1.7.1+cpu \
    torchvision==0.8.2+cpu

You will find that this installs ~200MB of deep learning libraries (PyTorch). You will also need to update your Django project's settings:

INSTALLED_APPS = [
    # ... whatever ...
    "wagtailclip",
]

# A place to store ~330MB of downloaded model parameters
WAGTAIL_CLIP_DOWNLOAD_PATH = "/clip"
# Maximum number of search results
WAGTAIL_CLIP_MAX_IMAGE_SEARCH_RESULTS = 256
# A unique name for the search backend.
WAGTAIL_CLIP_SEARCH_BACKEND_NAME = "clip"
# Recommended model, or your can roll your own (read the source).
WAGTAILIMAGES_IMAGE_MODEL = "wagtailclip.NaturalSearchImage"
# Add the search backend.
WAGTAILSEARCH_BACKENDS = {
    # ... whatever ...
    WAGTAIL_CLIP_SEARCH_BACKEND_NAME: {
        "BACKEND": "wagtailclip.search.CLIPSearchBackend",
    },
}

That's enough to get started, however if you want pre-download the ~330MB of model parameters, you can run this management command:

./manage.py download_clip

How it works

This package wraps the CLIP model. which can be used for:

  • encoding text into 1x512 float vectors
  • encoding images into 1x512 float vectors

These vectors can be thought of as points in a 512 dimensional space, where the closer two points are to each other, the more "related" they are. Importantly, CLIP encodes both text and images into the same space, meaning that we can:

  • encode all Wagtail images into vectors and store them in the database
  • encode a user's search query text into a vector; and then
  • compare the search query vector with all the image vectors

This comparison is done using a dot product to get a similarity score for each image. The operation is performed in Python. Once we have a similarity score we pick the top N (say, 256) most similar images and return those as the results.

Will this scale?

Haha probably not. I've tested my naive implementation on up to 2048 images and it runs OK (~3s / query). There are specialized Postgres extensions and vector similarity databases that you can use if you want to do this for tens of thousands of images.

Contributing

If you want to help out, make a pull request and/or email me at [email protected] or DM me on Twitter. Probably better to talk to me first before writing a bunch of code.

wagtail-clip's People

Contributors

mattsegal avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.