This tutorial uses similar concepts to ai-rag-app but is a more advanced version. In addition to the first RAG app, this one runs locally using open-source LLMs from AWS, allows you to update your database with new entries, and provides the ability to test/evaluate our AI generated responses.
-
Do the following before installing the dependencies found in
requirements.txt
file because of current challenges installingonnxruntime
throughpip install onnxruntime
.-
For Windows users, follow the guide here to install the Microsoft C++ Build Tools. Be sure to follow through to the last step to set the enviroment variable path.
-
For MacOS users, a workaround is to first install
onnxruntime
dependency forchromadb
using:
conda install onnxruntime -c conda-forge
See this thread for additonal help if needed.
-
-
Now run this command to install dependenies in the
requirements.txt
file.
pip install -r requirements.txt
Gather PDFs that you would like to use as your source material and place them in your data folder.
If you want to load other types of documents, go to Langchain Document Loaders and select a document loader of your choice. There are also 3rd party document loaders here.
Most PDFs are too big to use on their own. We need to split them into manageable chunks using Lanchain's recursive text splitter.
Create a function that returns an embedding function as its own file since we will use this function in two separate places: 1) when we create the database itself and 2) when we query the database.
I used AWS Bedrock for my embedding integrations but you can use any of these embedding models.
Update your database using Chroma
run your files in a terminal:
python get_embedding_function.py
then
python populate_database.py
then add a prompt to your query file:
python query_data.py "who is Andrew Huberman?"
Quality of answers will depend on:
Ask a question that you know the answer to with high certainty. For example,
Question: "What does PRP stand for?" Expected response: "Platelet Rich Plasma"
test the file using
python test_rag.py
or
pytest