Giter Site home page Giter Site logo

naveentnj / conversation_with_pdf_summary Goto Github PK

View Code? Open in Web Editor NEW
0.0 1.0 0.0 119 KB

Worked with both both Open AI API and HuggingFace API for PDF Text Extraction and summarization in chat format and ask queries related to the PDF

Python 100.00%

conversation_with_pdf_summary's Introduction

Conversation_with_PDF_Summary

Worked with both both Open AI API and HuggingFace API for PDF Text Extraction and summarization in chat format and ask queries related to the PDF

Used Local Cache files for text embedding model from Hugging Face

Description about the Project

1. Used Langchain tools to convert PDF text data into vector data and store it in a FAISS database.
This allows the project to represent PDF text data in a way that can be efficiently processed by LLMs. LLMs can then be used to query and summarize the PDF text data quickly and accurately.

2. Integrated OpenAI GPT API to query and summarize multiple PDF text documents.
This allows the project to provide efficient retrieval of relevant information from multiple PDF documents simultaneously. This is useful for tasks such as comparing two or more documents, or finding information that is scattered across multiple documents.

3. Used Conversation Buffer Memory and Conversational Retrieval Chain to store the queries and answers.
This allows the project to keep track of the conversation history and use it to inform subsequent responses. This makes the chat app more conversational and engaging, and allows it to provide more relevant and helpful answers to user queries.

4. Used Hugging Face Transformers to access and fine-tune LLMs.
Hugging Face Transformers is a popular library for natural language processing (NLP) that provides a unified API for accessing and fine-tuning LLMs. The project uses this library to access the OpenAI GPT API and to fine-tune the model on its own dataset of PDF text data. This allows the project to improve the performance of the model on its specific task.

5. Used Langchain to integrate the different components of the system.
Langchain is a library that provides tools for building and deploying large-scale NLP applications. The project uses Langchain to integrate the different components of the system, such as the PDF text extractor, the text-to-embedding converter, the FAISS database, and the OpenAI GPT API. This simplifies the development and deployment of the system.

6. Provided a chat interface for users to query the summarized PDF text using streamlit.
- The project creates a Streamlit app that loads the pre-trained LLM and the FAISS database.
- The app also creates a chat interface using Streamlit widgets.
- When a user types in a query, the app sends the query to the LLM.
- The LLM generates a response, which is then displayed to the user in the chat interface.
- The app also stores the conversation history in memory, so that the LLM can use it to inform subsequent responses.
- The Streamlit app can be deployed to the cloud using a service such as Heroku, so that it can be accessed by users from anywhere in the world.

Overall, the project "Chat with PDF using Hugging Face and Open AI" with Conversation Buffer Memory and Conversational Retrieval Chain makes effective use of LLMs to provide a chat interface for users to query and summarize multiple PDF text documents efficiently.

conversation_with_pdf_summary's People

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.