DocuQuery with Llama3 and Groq Demo

This Streamlit application demonstrates the integration of ChatGroq (Llama3 model), OpenAIEmbeddings, and FAISS for document embedding and retrieval. Users enter a question, and the app retrieves the relevant documents and answers the question from that retrieved context.

Features

  • Document Embedding: Embed documents using OpenAI embeddings and store them using FAISS.
  • Question Answering: Answer user questions based on embedded documents using ChatGroq's Llama3 model.
  • Document Similarity Search: Display similar documents related to the user's query.

Installation

To run this application locally, follow these steps:

  Prerequisites: Python with pip, an OpenAI API key, and a Groq API key.

  1. Clone the Repository

    git clone https://github.com/Tobsky/DocuQuery
    cd DocuQuery
    
  2. Set Up Environment Variables

    Create a .env file in the root directory of the project and add your OpenAI and Groq API keys:

    OPENAI_API_KEY=your_openai_api_key
    GROQ_API_KEY=your_groq_api_key
    
  3. Install Dependencies

    pip install -r requirements.txt
    
  4. Run the Application

    streamlit run app.py
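
Before launching the app, it can help to confirm that both keys from step 2 are actually present. The sketch below is a simplified, standard-library stand-in for what a dotenv loader does: it only handles plain KEY=VALUE lines and '#' comments. The names parse_env_text and missing_keys are hypothetical helpers for illustration, not part of this repository.

```python
import os

# The two keys named in the "Set Up Environment Variables" step.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; skip blanks and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in env."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env = dict(os.environ)
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            env.update(parse_env_text(handle.read()))
    print("Missing keys:", missing_keys(env) or "none")
```

Run from the project root, the script reports any key that is unset or empty, which is a quick way to rule out configuration problems before starting Streamlit.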
    

Usage

  1. Embedding Documents: Click the "Embed Documents" button to process and embed the documents located in the ./PDFdocs directory.
  2. Ask a Question: Enter your question in the text input field and press Enter. The app will retrieve relevant documents and provide an answer based on the context.
  3. View Similar Documents: Expand the "Document Similarity Search" section to view similar documents related to your query.
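
Step 1 depends on the ./PDFdocs directory being present and populated. A minimal sketch of a pre-flight check follows; list_pdfs is a hypothetical helper, and matching on the ".pdf" suffix is an assumption about how the files are named rather than how the app itself loads them.

```python
from pathlib import Path


def list_pdfs(directory="./PDFdocs"):
    """Return sorted PDF paths in directory, or raise if it is missing."""
    folder = Path(directory)
    if not folder.is_dir():
        raise FileNotFoundError(f"Document directory not found: {folder}")
    # Match on the file suffix only; this does not validate PDF contents.
    return sorted(p for p in folder.iterdir() if p.suffix.lower() == ".pdf")
```

An empty list from list_pdfs() means the directory exists but holds no files with a .pdf suffix, so there would be nothing to embed.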

Code Overview

Main Components

  1. Environment Setup: Load API keys from the .env file using dotenv.
  2. Document Embedding: Embed documents using OpenAI embeddings and store them with FAISS.
  3. Question Answering: Use ChatGroq's Llama3 model to answer questions based on the provided context.
  4. Streamlit Interface: Provide a user interface to embed documents, ask questions, and view similar documents.
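
To illustrate the idea behind the embedding and retrieval components, here is a toy sketch of similarity search over vectors. It uses hand-made 3-dimensional vectors and brute-force cosine similarity in plain Python as a stand-in; the real app relies on OpenAI embeddings and a FAISS index, whose distance metric and internals may differ. All names and data here are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_docs, k=2):
    """Return the k (score, text) pairs most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_docs
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hand-made vectors standing in for real embeddings (illustrative only).
TOY_INDEX = [
    ("Invoice policy", [0.9, 0.1, 0.0]),
    ("Holiday schedule", [0.0, 0.2, 0.9]),
    ("Refund process", [0.8, 0.3, 0.1]),
]

results = top_k([1.0, 0.2, 0.0], TOY_INDEX, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

The query vector points in nearly the same direction as the first and third entries, so those two rank highest. A vector store applies the same principle at scale with an index instead of a full scan.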

Key Functions

  1. vector_embedding(): Handles document embedding and vector store creation.
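  2. create_stuff_documents_chain(): LangChain helper that builds a chain which inserts ("stuffs") the retrieved documents into a single prompt for the LLM.
  3. create_retrieval_chain(): LangChain helper that builds a chain which fetches documents relevant to the user's query and passes them to the document chain.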
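
The "stuff" approach named in the second function can be sketched without any library: join all retrieved documents into one context string and place it in the prompt together with the question. The template wording and the stuff_documents helper below are assumptions for illustration; they are not the prompt or code used by this app or by LangChain.

```python
# Hypothetical prompt template; the app's actual prompt may differ.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)


def stuff_documents(documents, question, template=PROMPT_TEMPLATE):
    """Join every document into one context string and fill the template."""
    context = "\n\n".join(documents)
    return template.format(context=context, input=question)


prompt = stuff_documents(
    ["FAISS stores vectors.", "Groq serves Llama3."],
    "What stores vectors?",
)
print(prompt)
```

Because every document is placed in a single prompt, this approach suits a small number of retrieved chunks; the total must still fit within the model's context window.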

Troubleshooting

Common Errors

  1. Rate Limit Error: If you exceed the API quota, consider upgrading your OpenAI plan or reducing the number of API calls.
  2. Environment Variable Errors: Ensure your .env file is correctly set up with valid API keys.
  3. Document Loading Issues: Verify that the document directory (./PDFdocs) exists and contains valid PDF files.
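
One common way to reduce rate-limit failures is to retry with exponential backoff. The sketch below shows the pattern in isolation. RateLimitError is a hypothetical stand-in for whatever exception the API client raises, and the attempt count and delays are arbitrary assumptions.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

Passing sleep as a parameter keeps the helper easy to test without real waiting. In practice, func would wrap the embedding or chat call that hit the limit.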

Contributing

If you would like to contribute to this project, please fork the repository and submit a pull request with your changes.
