Compare the base model's answers with a simple RAG version.
The RAG documents come from a dataset (text from web pages and PDFs) built with public data extracted from https://www.thoughtworks.com; see the other notebook for the scraper.
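The RAG flow being compared can be sketched roughly as follows. This is a minimal, hypothetical illustration with keyword-overlap retrieval (the notebook may use a vector store instead); `retrieve`, `build_prompt`, and the sample documents are assumptions, not the notebook's actual code.

```python
def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Score each document by word overlap with the question, return top-k."""
    q_words = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Prepend the retrieved context so the model answers from the documents."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {question}\nAnswer:"


docs = [
    "Thoughtworks publishes the Technology Radar twice a year.",
    "Retrieval-augmented generation grounds answers in retrieved documents.",
]
question = "What is retrieval-augmented generation?"
prompt = build_prompt(question, retrieve(question, docs))
```

The resulting `prompt` would then be sent to both the base model and the RAG version for comparison.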
Comments from this first approach:
Checking the sample generated at the end of the notebook, you can see that the RAG version reduces some hallucinations compared with the base Mistral model.
It also reduces verbosity in the answers.
The retriever strategy is very basic, which strongly affects the results.
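One reason a basic overlap retriever hurts results is that it weights common words ("the", "is") as heavily as rare, discriminative ones. A sketch of a slightly better variant, weighting query terms by inverse document frequency (hypothetical code, not from the notebook):

```python
import math


def idf_scores(documents: list[str]) -> dict[str, float]:
    """Inverse document frequency: terms in fewer documents get higher weight."""
    n = len(documents)
    df: dict[str, int] = {}
    for doc in documents:
        for word in set(doc.lower().split()):
            df[word] = df.get(word, 0) + 1
    return {w: math.log(n / c) for w, c in df.items()}


def rank(question: str, documents: list[str]) -> list[str]:
    """Rank documents by the summed IDF of query terms they contain."""
    idf = idf_scores(documents)
    q_words = set(question.lower().split())

    def score(doc: str) -> float:
        return sum(idf.get(w, 0.0) for w in q_words & set(doc.lower().split()))

    return sorted(documents, key=score, reverse=True)


corpus = [
    "the radar covers tools",
    "the radar covers rag techniques",
    "the company has many offices",
]
best = rank("what is rag", corpus)[0]
```

Here "the" appears in every document, so its IDF is zero and it no longer dominates the score; only the rare term "rag" moves a document up the ranking.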
The base model seems to be trained on data up to 2022, so it fails on questions involving more recent content, such as RAG topics. This supports the benefits of a RAG strategy, but it should also be compared against a fine-tuning approach.
There is a bug in the model prediction: the batch call does not work because it is not using the extra GPU correctly.
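Until the batch path is fixed, one common workaround is to split the prompt list into explicit chunks and run each chunk as a separate forward pass (optionally pinning chunks to devices in round-robin fashion). The helper below is a hypothetical sketch of that chunking step, not the notebook's code:

```python
def chunk(items: list, size: int) -> list[list]:
    """Split a list into consecutive chunks of at most `size` items,
    e.g. one chunk per forward pass or per GPU."""
    if size < 1:
        raise ValueError("chunk size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


prompts = ["q1", "q2", "q3", "q4", "q5"]
devices = ["cuda:0", "cuda:1"]  # assumed two-GPU setup

# Assign chunks to devices round-robin; each (device, batch) pair would
# then be tokenized and sent to that single device for generation.
assignments = [
    (devices[i % len(devices)], batch)
    for i, batch in enumerate(chunk(prompts, 2))
]
```

This sidesteps the broken batch call by making device placement explicit per chunk, at the cost of some manual bookkeeping.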