Giter Site home page Giter Site logo

searchanything's Introduction

SearchAnything

"SearchAnything" is a local semantic search engine, powered by various AI models, which allows you to search sentences and images based on their semantic meanings.

Check out our demo video to see how it works. [Zhihu blog] [Twitter]

text_demo.mov
image_demo.mov

pipeline

Installation

First, clone our repository: git clone [email protected]:Immortalise/SearchAnything.git

We recommend using a Conda environment to manage your Python dependencies as it allows you to isolate your Python environment.

Use the following commands to set up the environment for "SearchAnything":

conda env create -f env.yaml
conda activate anything

Please note that on MacOS systems, executing conda env create -f env.yaml may result in errors due to the CUDA packages and some other packages. We are currently addressing this issue and working on improving the MacOS environment compatibility.

Running

Adding Files

Start the application by running python anything.py in the console.

Upon running, you will see the following instructions:

[nltk_data] Downloading package punkt to /xxx/nltk_data... 
[nltk_data] Package punkt is already up-to-date! 
Adding text embedding model 
Adding image embedding model 
SearchAnything v1.0 Type 'exit' to exit.   
Type 'insert' to parse file.   
Type 'search' to search file.   
Type 'delete' to delete file. 
Instruction:

Type insert, followed by the file path. Please note that the file path can either be a single file or a directory. If a directory is specified, all supported files in the directory will be parsed and saved to the database.

Search Files

When searching files, you can also use a more user-friendly web interface by running:

streamlit run app.py

In this local web interface, you can search files based on their semantic meanings.

Supported File Types

We currently support the following file types:

  • Text: pdf, txt, md
  • Image: jpg, jpeg, png

Implementation

"SearchAnything" primarily involves two steps:

Embedding

Given a text or images, they are first processed into a vector (embedding). The main AI models for semantic search are based on the sentence-transformer repository.

Semantic search for text: all-mpnet-base-v2

Semantic search for images: clip-ViT-B-32

Save and Retrieve

After generating the embedding for each image and text, we save the embedding along with the file path into a database.

When given a query and a search type, we process the query into an embedding $e_q$, then retrieve all embeddings $[e_1, e_2, ..., e_n]$ from the database. We then calculate the cosine similarity between the query embedding e_q and each of $[e_1, e_2, ..., e_n]$, sort them in descending order, and return the results.

Privacy Protection

SearchAnything downloads the most advanced AI models to run locally, so there's no need to worry about your private data being compromised! Text semantic search only requires about 400MB of memory space, while image semantic search requires around 4GB of memory. We will add more models in the future to make it easier for users with different memory sizes to use.

TODOs

We're eager to hear your valuable feedback and constructive suggestions!

Here are some features we plan to implement in the future:

Functions

  • Support for deleting files.
  • Support audio input.
  • Support image query.
  • Autonomous monitoring of file changes and automatic add/delete files into database when files are added/deleted.
  • Support for semantic search of audio resources.
  • Test ImageBind model.
  • Support for MacOS system.

Text Semantic Search

  • Support for .pptx and .docx formats.
  • Integration with BM25 and Exact search.

Related work

Note the recent Github library: Semantra is similar to our SearchAnything and aims to facilitate semantic search of documents. SearchAnything differs from Semantra in the following ways:

  • flexible file support: SearchAnything supports not only text files (PDF, TXT, MD), but also images (e.g. JPG, JPEG, PNG). We are constantly working on expanding the range of supported file types.
  • Integration of different AI models: SearchAnything integrates different AI models based on the Sentence Transformer library. This approach ensures a powerful and versatile semantic understanding of a wide range of data.
  • User-friendly interface: SearchAnything is equipped with a native interface provided by Streamlit, which is designed to be very friendly for non-technical users.

Authors

Acknowledgements

searchanything's People

Contributors

immortalise avatar icegrave0391 avatar jindongwang avatar houwenxin avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.