CMCQA is a large conversational question-answering dataset for the Chinese medical domain. It was collected from ChunYu, a Chinese medical conversational question-answering website, and contains medical conversations from 45 departments, such as andrology, stomatology, and gynaecology and obstetrics. Specifically, CMCQA contains 1.3 million complete sessions, which amount to 19.83 million statements or 0.65 billion tokens, with a total size of 2.84 GB. We release all of the data as open source to promote research on conversational question answering in the medical field.
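To give a feel for the scale figures quoted above, the following minimal sketch derives rough per-session and per-statement averages from them. The constants are the rounded counts from this README, so the results are approximate; the function name `averages` and the dictionary keys are illustrative and not part of the dataset or any released tooling.

```python
# Rounded corpus statistics quoted in this README.
SESSIONS = 1_300_000      # complete sessions
STATEMENTS = 19_830_000   # statements (utterances)
TOKENS = 650_000_000      # tokens


def averages(sessions: int, statements: int, tokens: int) -> dict:
    """Return approximate averages derived from the headline counts.

    Hypothetical helper for illustration only.
    """
    return {
        "statements_per_session": statements / sessions,
        "tokens_per_statement": tokens / statements,
        "tokens_per_session": tokens / sessions,
    }


stats = averages(SESSIONS, STATEMENTS, TOKENS)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Under these rounded figures, a session averages about 15 statements and about 500 tokens, which may help when choosing a maximum sequence length or a truncation strategy.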
You can find our data on Google Drive.
You can also download the dataset from Baidu Netdisk.
@misc{CMCQA,
title={A large Chinese Medical CQA},
author={Yixuan Weng},
howpublished={\url{https://github.com/WENGSYX/CMCQA}},
year={2022}
}