Comments (1)
When evaluating Retrieval-Augmented Generation (RAG) models like the ones you’re working with, the choice of validation dataset largely determines how well the evaluation results reflect the model’s performance on the types of data and use cases you care about.
If your RAG model is intended to be used in a specific domain (like medical, legal, or technical documents), it would be beneficial to use a domain-specific validation dataset. This approach helps ensure that the model performs well on the type of content it will encounter in its expected environment.
However, if the model is intended for more general use, a generic labeled dataset could suffice. This kind of dataset helps evaluate the model’s ability to handle a broad range of topics and types of queries.
In your case, it would be beneficial to use a domain-specific validation dataset, tailored to the domain your RAG system targets, to accurately evaluate your RAG model using ARES.
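As a rough illustration of putting together such a validation set, here is a minimal sketch that draws a label-balanced sample from a pool of in-domain, human-labeled examples. The helper `sample_validation_set` and the field names (`Query`, `Document`, `Answer`, `Context_Relevance_Label`) are placeholders chosen for this example, not confirmed ARES field names or API calls, so adapt them to whatever format your evaluation setup expects.

```python
import random
from collections import defaultdict


def sample_validation_set(records, label_key, n_per_label, seed=0):
    """Draw up to n_per_label examples for each label value (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group the in-domain records by their human-assigned label.
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec[label_key]].append(rec)

    # Sample the same number of examples per label, capped by availability.
    sample = []
    for label in sorted(by_label):
        group = by_label[label]
        k = min(n_per_label, len(group))
        sample.extend(rng.sample(group, k))
    return sample


# Placeholder in-domain records; in practice these come from your own
# labeled domain data (e.g. medical, legal, or technical documents).
records = [
    {
        "Query": f"question {i}",
        "Document": f"passage {i}",
        "Answer": f"answer {i}",
        "Context_Relevance_Label": i % 2,  # assumed binary label
    }
    for i in range(20)
]

validation = sample_validation_set(
    records, "Context_Relevance_Label", n_per_label=5
)
```

Balancing by label is only one reasonable choice; if your production traffic is skewed toward one label, sampling in proportion to that distribution may give a more representative validation set.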
from ares.