karnikkanojia / semanticsearchdb

Semantic search bot leveraging LLMs. Enhances search with advanced language understanding. Pre-trained models, data pipelines, and optimized algorithms provided. Accurate, context-aware results for articles, knowledge bases, and complex queries. Unlock the power of semantic search in this repository.

License: Apache License 2.0



FastAPI Semantic Search Server


This repository contains a FastAPI server for performing semantic search and question-answering tasks using pre-trained language models and document embeddings. It combines LangChain document loaders, SentenceTransformer embeddings, a Chroma vector store, and a language model from the Hugging Face Model Hub.

Introduction

The FastAPI Semantic Search Server is designed to provide a RESTful API for querying a collection of documents and obtaining answers to questions based on the content of those documents. It integrates various components and models, including:

  • Document Loaders: It loads documents from a specified directory using various document loaders, such as PyMuPDFLoader.

  • Document Splitting: It splits loaded documents into smaller chunks to enable efficient processing and searching.

  • Sentence Embeddings: It uses SentenceTransformerEmbeddings to convert text into high-dimensional vectors, allowing for semantic similarity comparisons.

  • Vector Stores: It stores and indexes document embeddings efficiently using Chroma.

  • Language Models: It loads a language model from the Hugging Face Model Hub to perform question-answering tasks.

  • Question Answering Chain: It sets up a question-answering chain using the loaded language model and document embeddings.

The server exposes endpoints for querying the model with questions and retrieving answers along with relevant source documents.
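
To make the embedding and vector-store steps above more concrete, here is a minimal sketch of similarity-based retrieval in plain Python. It is not the repository's code: the server relies on SentenceTransformerEmbeddings and Chroma, whereas the three-dimensional vectors, the `INDEX` list, and the `top_k` helper below are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k=2):
    """Return the k (score, text) pairs whose vectors are closest to query_vec."""
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in indexed_chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; a real model produces hundreds of dimensions.
INDEX = [
    ("Chroma stores document embeddings.", [0.9, 0.1, 0.0]),
    ("FastAPI serves HTTP requests.", [0.0, 0.2, 0.9]),
    ("Sentence embeddings capture meaning.", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], INDEX, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A vector store such as Chroma performs the same kind of nearest-neighbour lookup, but with an index built for large collections instead of a linear scan.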

Prerequisites

Before using this server, ensure you have the following prerequisites:

  • Python 3.7, 3.8, or 3.9
  • Required Python packages (can be installed using pip):
    • fastapi
    • torch
    • langchain (You may need to install this library separately)
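
If you want to check an environment against the list above before starting, a small script along these lines may help. It is only a sketch: the helper names `python_supported` and `missing_packages` are hypothetical and not part of this repository.

```python
import sys
from importlib.util import find_spec

# Python versions named in the prerequisites list.
SUPPORTED = {(3, 7), (3, 8), (3, 9)}


def python_supported(version_info=sys.version_info):
    """True if the (major, minor) pair is one of the listed versions."""
    return (version_info[0], version_info[1]) in SUPPORTED


def missing_packages(names):
    """Return the top-level package names that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    if not python_supported():
        print("Warning: this Python version is not in the listed range.")
    missing = missing_packages(["fastapi", "torch", "langchain"])
    if missing:
        print("Missing packages:", ", ".join(missing))
```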

Installation

To install the required packages, you can use pip:

pip install fastapi uvicorn torch langchain sentence-transformers chromadb pymupdf faiss-cpu numpy psutil matplotlib

Usage

To use the FastAPI Semantic Search Server, follow these steps:

  1. Clone this repository and navigate to the project directory:

git clone https://github.com/karnikkanojia/SemanticSearchDB.git
cd SemanticSearchDB

  2. Set up the necessary configurations and models (see Configuration).

  3. Start the FastAPI server:

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

  4. Access the server's API at http://localhost:8000/ in your web browser or use a tool like curl or Postman to make API requests (see Endpoints).

Configuration

The server's configuration and models can be set up in the startup_event function in the main.py file. Here are some key configuration steps:

  • Load documents from a specified directory using load_docs.
  • Split the documents into chunks using split_docs.
  • Set up embeddings, vector stores, and language models.
  • Configure the question-answering chain using the loaded models.

You can customize the document loading, splitting, and model setup to fit your specific use case.
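
As a rough illustration of the loading and splitting steps, the sketch below uses only the standard library. The function names `load_docs` and `split_docs` come from this README, but the signatures, the defaults, the plain-text `.txt` input, and the fixed-size character chunking are assumptions made for the example; the actual functions in main.py use LangChain loaders such as PyMuPDFLoader and may behave differently.

```python
from pathlib import Path


def load_docs(directory):
    """Read every .txt file under directory into {"text", "source"} records."""
    docs = []
    for path in sorted(Path(directory).rglob("*.txt")):
        docs.append({"text": path.read_text(encoding="utf-8"), "source": str(path)})
    return docs


def split_docs(docs, chunk_size=1000, chunk_overlap=20):
    """Cut each document into overlapping chunks of at most chunk_size characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for doc in docs:
        text = doc["text"]
        for start in range(0, len(text), step):
            chunks.append(
                {"text": text[start:start + chunk_size], "source": doc["source"]}
            )
            if start + chunk_size >= len(text):
                break  # the last chunk already reaches the end of the text
    return chunks


sample = [{"text": "abcdefghij", "source": "example.txt"}]
pieces = split_docs(sample, chunk_size=4, chunk_overlap=1)
print([piece["text"] for piece in pieces])
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence that straddles a boundary is still partly visible in both.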

Set up .env

Create a .env file that holds your Hugging Face Hub API token:

HUGGINGFACEHUB_API_TOKEN=<TOKEN>
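
How the server reads this file is not shown here, so the following is only a sketch of one way to load it with the standard library. The `parse_env` and `load_env` helpers are hypothetical; a dedicated package for `.env` files would do the same job.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path=".env"):
    """Copy pairs from the file into os.environ without overriding existing ones."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    load_env()
    if not os.environ.get("HUGGINGFACEHUB_API_TOKEN"):
        print("HUGGINGFACEHUB_API_TOKEN is not set")
```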

Endpoints

The server exposes the following API endpoints:

  • GET /: A simple endpoint that returns a "Hello World" message. You can use this to verify that the server is running.

  • GET /query/{question}: This endpoint allows you to query the model with a given question and receive an answer. It also provides information about the sources used to generate the answer, including content, metadata, and scores.

Example usage:

curl http://localhost:8000/query/your-question
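
Because the question is part of the URL path, spaces and punctuation need to be percent-encoded. The sketch below shows one way to call the endpoint from Python with the standard library. It assumes the server is running on localhost:8000 and returns JSON (FastAPI's default); the helper names are hypothetical, and the exact shape of the response is not documented here, so it is returned unparsed beyond JSON decoding.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumed local address from the Usage section


def build_query_url(question, base_url=BASE_URL):
    """Build the /query/{question} URL with the question percent-encoded."""
    return f"{base_url}/query/{quote(question, safe='')}"


def ask(question, base_url=BASE_URL, timeout=60):
    """Send the question to the server and return the decoded JSON body."""
    with urlopen(build_query_url(question, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


print(build_query_url("What is Chroma?"))
```

With the server running, `ask("What is Chroma?")` would return whatever JSON the endpoint produces.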

License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.

Feel free to customize and extend this FastAPI Semantic Search Server to meet your specific requirements.
