KazQAD (Kazakh open-domain Question Answering Dataset) can be used in both reading comprehension and full open-domain question answering (ODQA) settings, as well as for information retrieval experiments.
The dataset draws on the following sources:
- Kazakh Wikipedia (a dump of 01-Jan-2023)
- Google's Natural Questions (NQ)
- Unified National Testing (UNT) questions
The questions come from two sources: translated items from the Natural Questions (NQ) dataset (only for training) and the original Kazakh Unified National Testing (UNT) exam (for development and testing).
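The split-to-source relationship described above can be written down as a small lookup. This is only an illustrative sketch: the split names (`train`, `dev`, `test`), the `split` record field, and the helper names are assumptions made for the example, not the dataset's documented schema.

```python
# Question origin per split, as described in the text above.
# Split names and the record layout are hypothetical.
SPLIT_TO_SOURCE = {
    "train": "NQ (translated)",
    "dev": "UNT (original Kazakh)",
    "test": "UNT (original Kazakh)",
}


def question_source(split: str) -> str:
    """Return the origin of the questions in the given split."""
    try:
        return SPLIT_TO_SOURCE[split]
    except KeyError:
        raise ValueError(f"unknown split: {split!r}") from None


def group_by_source(records):
    """Group records (dicts with a hypothetical 'split' key) by question origin."""
    groups = {}
    for rec in records:
        groups.setdefault(question_source(rec["split"]), []).append(rec)
    return groups
```

A lookup like this makes it explicit that any evaluation on the development or test split measures performance on original Kazakh exam questions, while training uses translated NQ items only.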
@misc{kazqad,
title={KazQAD: Kazakh Open-Domain Question Answering Dataset},
author={Rustem Yeshpanov and Pavel Efimov and Leonid Boytsov and Ardak Shalkarbayuli and Pavel Braslavski},
year={2024},
eprint={2404.04487},
archivePrefix={arXiv},
primaryClass={cs.CL}
}