A Large Language Model benchmark for evaluating performance on Retrieval-Augmented Generation tasks.
andreacecchin / retrievedrelevantcontextqa