ChiMDQA: Towards Comprehensive Chinese Document QA with Fine-Grained Evaluation

📅 2025-11-05
🏛️ International Conference on Artificial Neural Networks
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the scarcity of high-quality benchmark datasets for multi-domain Chinese long-document question answering (QA), this paper introduces ChiMDQA, a fine-grained Chinese document QA dataset covering six domains: academia, education, finance, law, healthcare, and news. It comprises 6,068 human-annotated QA pairs, constructed via a cross-domain fine-grained taxonomy and a systematic question-generation methodology, supporting tasks including document understanding, knowledge extraction, and multi-document QA. Rigorous document selection, multi-round human verification, and quality control ensure annotation consistency and semantic accuracy. A dedicated fine-grained evaluation framework tailored for Chinese long texts is also proposed. The dataset and code are publicly released. Empirical evaluations demonstrate that models fine-tuned on ChiMDQA significantly outperform those trained on existing benchmarks across multiple downstream QA tasks, establishing ChiMDQA as a reliable, diverse, and highly adaptable new benchmark for Chinese document intelligence research.

📝 Abstract
With the rapid advancement of natural language processing (NLP) technologies, the demand for high-quality Chinese document question-answering datasets is steadily growing. To address this need, we present the Chinese Multi-Document Question Answering Dataset (ChiMDQA), specifically designed for downstream business scenarios across prevalent domains including academia, education, finance, law, healthcare, and news. ChiMDQA encompasses long-form documents from these six fields, consisting of 6,068 rigorously curated, high-quality question-answer (QA) pairs further classified into ten fine-grained categories. Through meticulous document screening and a systematic question-design methodology, the dataset guarantees both diversity and high quality, rendering it applicable to various NLP tasks such as document comprehension, knowledge extraction, and intelligent QA systems. Additionally, this paper offers a comprehensive overview of the dataset's design objectives, construction methodologies, and fine-grained evaluation system, supplying a substantial foundation for future research and practical applications in Chinese QA. The code and data are available at: https://anonymous.4open.science/r/Foxit-CHiMDQA/.
Problem

Research questions and friction points this paper is trying to address.

Addressing the lack of high-quality Chinese document QA datasets
Providing fine-grained evaluation across six diverse business domains
Supporting NLP tasks like document comprehension and knowledge extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed a Chinese multi-document QA dataset
Implemented a fine-grained categorization of QA pairs
Established a systematic document-screening methodology
Jing Gao
Beijing Jiaotong University, Beijing, China
Shutiao Luo
Beijing University of Posts and Telecommunications, Beijing, China
Yumeng Liu
PhD student, The University of Hong Kong
Motion Planning, Robotic Manipulation
Yuanming Li
Foxit Software Co. Ltd, Fuzhou, China
Hongji Zeng
Foxit Software Co. Ltd, Fuzhou, China