Automatic Inter-document Multi-hop Scientific QA Generation

📅 2026-03-15
🤖 AI Summary
Existing approaches to automatic scientific question generation are largely confined to single-document, factoid-style questions and overlook multi-hop reasoning across documents. To address this limitation, this work proposes AIM-SciQA, an automated framework for constructing cross-document, multi-hop scientific question answering datasets. The framework uses large language models to extract single-hop question-answer pairs from scientific literature, then links them across documents through embedding-based semantic alignment, selectively drawing on citation information. Applied to 8,211 PubMed Central articles, the method generates 411,409 single-hop and 13,672 multi-hop QAs, forming the IM-SciQA dataset; a citation-guided variant, CIM-SciQA, extends it. Human and automatic validation indicate high factual consistency, and experiments show that the dataset differentiates models' reasoning capabilities across retrieval and QA stages.

📝 Abstract
Existing automatic scientific question generation studies mainly focus on single-document factoid QA, overlooking the inter-document reasoning crucial for scientific understanding. We present AIM-SciQA, an automated framework for generating multi-document, multi-hop scientific QA datasets. AIM-SciQA extracts single-hop QAs using large language models (LLMs) with machine reading comprehension and constructs cross-document relations based on embedding-based semantic alignment while selectively leveraging citation information. Applied to 8,211 PubMed Central papers, it produced 411,409 single-hop and 13,672 multi-hop QAs, forming the IM-SciQA dataset. Human and automatic validation confirmed high factual consistency, and experimental results demonstrate that IM-SciQA effectively differentiates reasoning capabilities across retrieval and QA stages, providing a realistic and interpretable benchmark for retrieval-augmented scientific reasoning. We further extend this framework to construct CIM-SciQA, a citation-guided variant achieving comparable performance to the Oracle setting, reinforcing the dataset's validity and generality.
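
The abstract describes linking single-hop QAs across documents through embedding-based semantic alignment, optionally guided by citations. The sketch below is one possible reading of that idea, not the authors' implementation: it assumes a second hop can follow a first hop when the first answer's embedding is close to the second question's embedding. The names `SingleHopQA`, `link_qas`, the `threshold` value, the document IDs, and the hand-written 3-dimensional vectors are all hypothetical; a real pipeline would obtain embeddings from a sentence-embedding model.

```python
from dataclasses import dataclass
from itertools import permutations
from math import sqrt


@dataclass(frozen=True)
class SingleHopQA:
    doc_id: str
    question: str
    answer: str
    q_vec: tuple  # toy question embedding (hypothetical values)
    a_vec: tuple  # toy answer embedding (hypothetical values)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_qas(qas, threshold=0.8, citations=None):
    """Return (first_hop, second_hop, score) triples for cross-document pairs.

    Assumption: a pair is linkable when the first answer's embedding aligns
    with the second question's embedding. If `citations` is given, it is a
    set of (citing_doc, cited_doc) tuples and only those pairs are considered.
    """
    links = []
    for first, second in permutations(qas, 2):
        if first.doc_id == second.doc_id:
            continue  # inter-document links only
        if citations is not None and (first.doc_id, second.doc_id) not in citations:
            continue  # citation-guided variant: skip uncited pairs
        score = cosine(first.a_vec, second.q_vec)
        if score >= threshold:
            links.append((first, second, score))
    return sorted(links, key=lambda link: link[2], reverse=True)


# Toy single-hop QAs from three hypothetical articles.
qas = [
    SingleHopQA("PMC_A", "Which protein does drug X inhibit?", "kinase K",
                (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    SingleHopQA("PMC_B", "Which pathway is regulated by kinase K?", "pathway P",
                (0.0, 0.9, 0.1), (0.0, 0.0, 1.0)),
    SingleHopQA("PMC_C", "How many patients were enrolled?", "120 patients",
                (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
]

for first, second, score in link_qas(qas):
    print(f"{first.doc_id} -> {second.doc_id} ({score:.3f}): "
          f"{first.question} / {second.question}")
```

Passing `citations={("PMC_A", "PMC_B")}` restricts candidates to pairs where the first document cites the second, which loosely mirrors the citation-guided construction the abstract mentions for CIM-SciQA. Composing the linked pair into a single natural-language multi-hop question would still require a further generation step that this sketch leaves out.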
Problem

Research questions and friction points this paper is trying to address.

multi-hop reasoning
scientific question answering
inter-document reasoning
question generation
retrieval-augmented reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-hop reasoning
scientific question generation
cross-document alignment
retrieval-augmented QA
citation-guided construction
Seungmin Lee
Yonsei University, Seoul, Republic of Korea; OnomaAI, Seoul, Republic of Korea
Dongha Kim
Arizona State University
Yuni Jeon
Yonsei University, Seoul, Republic of Korea; OnomaAI, Seoul, Republic of Korea
Junyoung Koh
Yonsei University, Seoul, Republic of Korea
Min Song
Yonsei University
Text Mining, Social Media Mining