This project is forked from menloparklab/langchain-cohere-qdrant-doc-retrieval

This Flask backend API takes a document in multiple formats and allows you to perform semantic search using LangChain, Cohere, and Qdrant.

langchain-cohere-qdrant-doc-retrieval

This Flask backend API takes a document in one of several formats (.txt, .docx, .pptx, .jpg, .png, .eml, .html, and .pdf) and lets you run a semantic search over it in the 100+ languages supported by the Cohere multilingual API. The Qdrant vector database stores the embeddings.
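
As a small illustration of the format list above, a client could check a file URL's extension before sending it to the API. This is only a sketch: the helper name has_supported_extension is hypothetical and not part of this repository, and the service itself may detect file types differently (for example by content rather than by extension).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the format list in this README.
SUPPORTED_EXTENSIONS = {".txt", ".docx", ".pptx", ".jpg", ".png", ".eml", ".html", ".pdf"}


def has_supported_extension(file_url):
    """Return True if the URL path ends in one of the listed extensions.

    Hypothetical client-side pre-check; the query string is ignored and
    the comparison is case-insensitive.
    """
    path = urlparse(file_url).path
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A URL without a recognised extension is not necessarily unsupported by the service; this check only mirrors the list above.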

Setup

The following steps show how to run the application on macOS or Linux.

Prerequisites

  • Python 3
  • Git
  • virtualenv
  • Homebrew

Installation

  1. Clone the repository
git clone https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval docQA
  2. Change into the directory
cd docQA
  3. Create and activate a virtual environment
python3 -m venv env
source env/bin/activate
  4. Install the required packages
pip install -r requirements.txt
  5. Install Homebrew

Follow the installation guide on the Homebrew website.

  6. Install the following brew packages
brew install libmagic poppler tesseract libxml2 libxslt
  7. Create a .env file and set the following environment variables:
cohere_api_key="insert here"
openai_api_key="insert here"
qdrant_url="insert here"
qdrant_api_key="insert here"

Replace the placeholder values with your own API keys and Qdrant URL.

Qdrant URL and API keys

Sign up for a free cloud-based Qdrant account and create a new cluster. You can then get the qdrant_url and qdrant_api_key values used in the .env file above.

  8. Run the application using the following command:
gunicorn app:app
  9. Access the API endpoints

The API endpoints will be live at the following routes:

  • /embed
  • /retrieve
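
If the server fails to start or the endpoints return errors, one thing worth checking is whether a .env value is missing or still set to the "insert here" placeholder. The following is a minimal sketch, not part of this repository, that checks the four variable names listed in the .env step. The helper names parse_env and missing_keys are hypothetical, and the parsing handles only simple key="value" lines.

```python
from pathlib import Path

# Variable names taken from the .env step in this README.
REQUIRED_KEYS = ("cohere_api_key", "openai_api_key", "qdrant_url", "qdrant_api_key")
PLACEHOLDER = "insert here"


def parse_env(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values):
    """Return required keys that are absent, empty, or still placeholders."""
    return [k for k in REQUIRED_KEYS if values.get(k, "") in ("", PLACEHOLDER)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing or placeholder values:", missing if missing else "none")
```

Run it from the project directory (the one containing .env) before starting gunicorn.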

Conclusion

You have successfully installed and run the DocQA system on your local machine. Feel free to explore the code and change it to suit your requirements.

Connecting to a frontend

The deployed API endpoints, /embed and /retrieve, can now be called from any frontend application. Bubble users can watch this video for detailed instructions.

Include the following header in every API request: "Content-Type": "application/json"

JSON body for /embed:
{ "collection_name": "{collection_name}", "file_url": "{file_url}" }

JSON body for /retrieve:
{ "collection_name": "{collection_name}", "query": "{query}" }
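
The header and the two JSON bodies above can be assembled in a few lines of Python using only the standard library. This is a sketch under stated assumptions rather than the project's own client: it assumes the endpoints accept POST requests (the README does not name the HTTP method), that the server is reachable at gunicorn's default bind address http://127.0.0.1:8000, and the function names are hypothetical. The response format is not documented here, so send returns the raw response text.

```python
import json
import urllib.request

# Assumption: gunicorn's default bind address; change it for a deployed server.
BASE_URL = "http://127.0.0.1:8000"


def build_request(route, payload, base_url=BASE_URL):
    """Build a JSON request for the given route (POST is an assumption)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + route,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed_request(collection_name, file_url, base_url=BASE_URL):
    """Request body for /embed, as documented above."""
    payload = {"collection_name": collection_name, "file_url": file_url}
    return build_request("/embed", payload, base_url)


def retrieve_request(collection_name, query, base_url=BASE_URL):
    """Request body for /retrieve, as documented above."""
    payload = {"collection_name": collection_name, "query": query}
    return build_request("/retrieve", payload, base_url)


def send(request, timeout=60):
    """Send the request and return the raw response text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, send(embed_request("my_docs", "https://example.com/file.pdf")) followed by send(retrieve_request("my_docs", "What is this document about?")) would embed a file and then query the same collection; the collection name and URL here are placeholders.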

For Bubble users

Embed JSON for Bubble:
{ "collection_name": "<collection_name>", "file_url": "<file_url>" }

Retrieve JSON for Bubble:
{ "collection_name": "<collection_name>", "query": "<query>" }

Feel free to reach out on Twitter if you have any questions.

Contributors

  • misbahsy
