Mr. RetrieveRite is a Retrieval-Augmented Generation (RAG) tool designed for effective, efficient, and high-quality search and text generation.
- Overview
- Installation
- Usage
- Model and Data
- API Keys and Credentials
- Contributing
- Acknowledgments
- Contact
Retrieval-Augmented Generation (RAG) enhances large language models (LLMs) by incorporating external knowledge sources, enabling more informed responses beyond the model's training data. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts, all without the need to retrain the model.
- Challenges to LLMs : LLMs make things up when they do not have the answer, and can present out-of-date or generic information.
- Cost-effective implementation : The computational and financial costs of retraining base models on organization- or domain-specific information are high. Additionally, APIs charge per processed token, so giving the model only the relevant input to generate a response is cost efficient.
- Current information : LLMs are trained on data up to a specific cutoff date. RAG allows developers to provide the latest research, statistics, or news to the generative model while maintaining relevancy.
- Enhanced user trust : RAG can provide source attribution. The output includes references to sources so the user can cross-check.
- Langchain, OpenAI API, Hugging Face, Streamlit, FAISS
- For this project, Langchain and the OpenAI API are used for processing data and issuing prompts; a Hugging Face transformer model is used to create embeddings; FAISS is used for retrieval; and Streamlit is used to design the UI.
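The stack above suggests a `requirements.txt` along these lines. The exact package names and the inclusion of `python-dotenv` and `unstructured` are assumptions inferred from the tools mentioned in this README, not the project's actual pinned dependencies:

```text
langchain
openai
sentence-transformers
faiss-cpu
streamlit
python-dotenv
unstructured
```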
- Customer Service Industry
- Advertising and Marketing
- Education and E-Learning
- Healthcare Industry
- E-commerce and Retail Industry
- Data Ingestion : Uses Langchain's UnstructuredURLLoader class to load data from URLs.
- Split data into chunks : Uses Langchain's RecursiveCharacterTextSplitter class.
- Vector DB : Vectorizes the chunks using HuggingFaceBgeEmbeddings and creates a FAISS vector store from the embeddings and the splits.
- Retrieval and prompt : Retrieves the relevant chunks from the vector store and formulates an LLM prompt, using the RetrievalQAWithSourcesChain class and the OpenAI API.
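The retrieval steps above can be sketched with a toy in-memory version. Bag-of-words vectors stand in for the Hugging Face embeddings and a brute-force cosine search stands in for the FAISS index; all function names here are illustrative, not the project's actual API:

```python
import math
from collections import Counter

def split_into_chunks(text: str, chunk_size: int = 40) -> list[str]:
    """Crude stand-in for RecursiveCharacterTextSplitter: fixed-size windows."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding' (the real project uses transformer vectors)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query (FAISS stand-in)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = [
    "RAG retrieves relevant chunks from a vector store.",
    "Streamlit is used to design the user interface.",
    "FAISS performs fast similarity search over embeddings.",
]
chunks = [c for d in docs for c in split_into_chunks(d, chunk_size=60)]
print(retrieve("Which library performs similarity search?", chunks, k=1)[0])
# prints "FAISS performs fast similarity search over embeddings."
```

In the real pipeline the same shape holds, but the embedding function is a transformer model and the search runs inside a FAISS index instead of a Python sort.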
- Phase 1 : The project currently works as a prototype: you can copy-paste three URLs from the internet and ask questions about their content. The foundational structure is in place to expand its capabilities to deal with large information databases.
- Host the app on a server
- Make it more application specific and build different products out of it.
- Make the data-ingestion system more robust
- Include a more robust and capable Vector DB
**Python 3.9**
- Clone this repository to your local machine using:
$ git clone https://github.com/Palpendiculal/MisterRetriveRite.git
$ cd MisterRetriveRite
- Create a conda environment and install dependencies
$ pip install -r requirements.txt
- Update the .env file with your API key
- To run the app, execute the following command in your terminal:
$ streamlit run app.py
- Copy-paste URLs from the internet and ask questions.
For embeddings, all-mpnet-base-v2 from Hugging Face is used. For prompts, gpt-3.5-turbo-instruct from OpenAI is used.
To generate an API key, go to the OpenAI website, generate the key, and update your .env file.
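A minimal sketch of how the key can be loaded from the .env file at startup. The real project may use a library such as python-dotenv; this stdlib-only version just shows the idea, and the variable name OPENAI_API_KEY is an assumption based on OpenAI's usual convention:

```python
import os

def load_env(path: str = ".env") -> None:
    """Parse KEY=VALUE lines from a .env file and export them into os.environ.

    Blank lines and '#' comments are skipped; existing environment
    variables are not overwritten.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())

# Hypothetical usage: load_env() then read os.environ["OPENAI_API_KEY"].
```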
To contribute, feel free to fork the repo and open a pull request. Please do not hesitate to contact me on LinkedIn.
- https://www.lettria.com/blogpost/retrieval-augmented-generation-5-uses-and-their-examples
- https://colabdoge.medium.com/what-is-rag-retrieval-augmented-generation-b0afc5dd5e79
- https://www.hopsworks.ai/dictionary/vector-database
- https://codebasics.io/
[LinkedIn](https://www.linkedin.com/in/varun-kumar-singh-b01083148/)