This repository contains a FastAPI server for performing semantic search and question-answering tasks using pre-trained language models and document embeddings. It leverages several NLP libraries and models to provide efficient and accurate results.
The FastAPI Semantic Search Server is designed to provide a RESTful API for querying a collection of documents and obtaining answers to questions based on the content of those documents. It integrates various components and models, including:
- Document Loaders: Loads documents from a specified directory using loaders such as PyMuPDFLoader.
- Document Splitting: Splits loaded documents into smaller chunks to enable efficient processing and searching.
- Sentence Embeddings: Uses SentenceTransformerEmbeddings to convert text into high-dimensional vectors, allowing for semantic similarity comparisons.
- Vector Stores: Stores and indexes document embeddings efficiently using Chroma.
- Language Models: Loads a language model from the Hugging Face Model Hub to perform question-answering tasks.
- Question Answering Chain: Sets up a question-answering chain using the loaded language model and document embeddings.
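As a rough sketch of how these pieces typically fit together with LangChain (the `data/` directory, chunk sizes, and the `google/flan-t5-base` model below are illustrative assumptions, not the repository's actual settings):

```python
# Illustrative pipeline only; directory, chunk sizes, and model repo are assumptions.
from langchain.document_loaders import DirectoryLoader, PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFaceHub
from langchain.chains.question_answering import load_qa_chain

# Load every PDF under ./data with PyMuPDFLoader.
documents = DirectoryLoader("data/", glob="**/*.pdf", loader_cls=PyMuPDFLoader).load()

# Split the documents into overlapping chunks for indexing.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(documents)

# Embed the chunks and index them in a Chroma vector store.
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma.from_documents(chunks, embeddings)

# Load a hosted Hugging Face model and build a question-answering chain.
llm = HuggingFaceHub(repo_id="google/flan-t5-base")
chain = load_qa_chain(llm, chain_type="stuff")

# Answer a question from the most similar chunks.
question = "What is semantic search?"
matches = db.similarity_search(question, k=4)
print(chain.run(input_documents=matches, question=question))
```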
The server exposes endpoints for querying the model with questions and retrieving answers along with relevant source documents.
Before using this server, ensure you have the following prerequisites:
- Python 3.7, 3.8, or 3.9
- Required Python packages: fastapi, torch, langchain, sentence-transformers, faiss-cpu, numpy, psutil, matplotlib

To install the required packages with pip:

```bash
pip install fastapi torch langchain sentence-transformers faiss-cpu numpy psutil matplotlib
```
To use the FastAPI Semantic Search Server, follow these steps:
- Clone this repository and navigate to the project directory:

```bash
git clone https://github.com/karnikkanojia/SemanticSearchDB.git
cd SemanticSearchDB
```
- Set up the necessary configurations and models (see Configuration).
- Start the FastAPI server:

```bash
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
```

- Access the server's API at http://localhost:8000/ in your web browser, or use a tool like curl or Postman to make API requests (see Endpoints).
The server's configuration and models are set up in the `startup_event` function in the `main.py` file. Here are the key configuration steps:

- Load documents from a specified directory using `load_docs`.
- Split the documents into chunks using `split_docs`.
- Set up embeddings, vector stores, and language models.
- Configure the question-answering chain using the loaded models.
You can customize the document loading, splitting, and model setup to fit your specific use case.
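A hypothetical `startup_event` along these lines illustrates how those steps map onto the FastAPI startup hook. The helper names `load_docs` and `split_docs` come from the list above; their signatures, the globals, and the chosen model are assumptions, so consult `main.py` for the real implementation:

```python
# Hypothetical sketch of the startup hook in main.py; names and defaults are assumptions.
from fastapi import FastAPI
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFaceHub
from langchain.chains.question_answering import load_qa_chain

app = FastAPI()
db = None     # Chroma vector store, populated at startup
chain = None  # question-answering chain, populated at startup

@app.on_event("startup")
async def startup_event():
    global db, chain
    documents = load_docs("data/")   # load documents (helper defined in main.py)
    chunks = split_docs(documents)   # split them into chunks (helper defined in main.py)
    embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
    db = Chroma.from_documents(chunks, embeddings)
    llm = HuggingFaceHub(repo_id="google/flan-t5-base")
    chain = load_qa_chain(llm, chain_type="stuff")
```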
Set up a .env file containing your Hugging Face Hub API token:

```
HUGGINGFACEHUB_API_TOKEN=<TOKEN>
```
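LangChain's HuggingFaceHub wrapper reads this token from the environment. If the project loads the .env file with python-dotenv (an assumption; `main.py` may do this differently), the relevant lines would look roughly like:

```python
# Assumes python-dotenv is installed; main.py may load the token another way.
import os
from dotenv import load_dotenv

load_dotenv()  # read .env from the working directory
token = os.getenv("HUGGINGFACEHUB_API_TOKEN")
assert token, "HUGGINGFACEHUB_API_TOKEN is not set"
```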
The server exposes the following API endpoints:
- `GET /`: A simple endpoint that returns a "Hello World" message. You can use it to verify that the server is running.
- `GET /query/{question}`: Queries the model with the given question and returns an answer, along with information about the sources used to generate it, including content, metadata, and scores.
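The handlers could be implemented roughly as follows, assuming the `db` and `chain` globals from the startup sketch above (the actual handlers and response fields live in `main.py`):

```python
# Hypothetical handlers; the real response shape is defined in main.py.
@app.get("/")
async def root():
    return {"message": "Hello World"}

@app.get("/query/{question}")
async def query(question: str):
    # Retrieve the most similar chunks together with their similarity scores.
    matches = db.similarity_search_with_score(question)
    answer = chain.run(input_documents=[doc for doc, _ in matches], question=question)
    return {
        "answer": answer,
        "sources": [
            {"content": doc.page_content, "metadata": doc.metadata, "score": score}
            for doc, score in matches
        ],
    }
```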
Example usage:

```bash
curl http://localhost:8000/query/your-question
```
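The same request can be made from Python; the fields in the returned JSON depend on how `main.py` builds the response:

```python
# Query the running server from Python.
import requests
from urllib.parse import quote

question = "What is semantic search?"
resp = requests.get(f"http://localhost:8000/query/{quote(question)}")
print(resp.json())
```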
This project is licensed under the MIT License. See the LICENSE file for details.
Feel free to customize and extend this FastAPI Semantic Search Server to meet your specific requirements.