BioPIE: A Biomedical Protocol Information Extraction Dataset for High-Reasoning-Complexity Experiment Question Answer

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the challenge that existing biomedical datasets struggle to support complex experimental question answering requiring high information density and multi-step reasoning. To bridge this gap, the authors propose BioPIE, the first fine-grained knowledge graph dataset centered on biomedical experimental procedures. BioPIE constructs structured knowledge representations by systematically extracting entities, actions, and their interrelationships from experimental protocols. This approach fills a critical void in structured knowledge for complex experimental scenarios and is integrated into a question-answering system to enable multi-hop reasoning. Experimental results demonstrate that the BioPIE-enhanced system significantly outperforms baseline models on both standard benchmarks and challenging questions, thereby substantially improving AI's capacity to assist with experimental design and analysis.

πŸ“ Abstract
Question Answering (QA) systems for biomedical experiments facilitate cross-disciplinary communication and serve as a foundation for downstream tasks such as laboratory automation. High Information Density (HID) and Multi-Step Reasoning (MSR) pose unique challenges for biomedical experimental QA. Although extracting structured knowledge, e.g., Knowledge Graphs (KGs), can substantially benefit biomedical experimental QA, existing biomedical datasets focus on general or coarse-grained knowledge and thus fail to support the fine-grained experimental reasoning demanded by HID and MSR. To address this gap, we introduce the Biomedical Protocol Information Extraction Dataset (BioPIE), which provides procedure-centric KGs of experimental entities, actions, and relations at a scale that supports reasoning over biomedical experiments across protocols. We evaluate information extraction methods on BioPIE and implement a QA system that leverages BioPIE, which shows performance gains on the test, HID, and MSR question sets. These results indicate that the structured experimental knowledge in BioPIE underpins both AI-assisted and more autonomous biomedical experimentation.
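
To make the idea of a procedure-centric KG more concrete, the sketch below stores a toy protocol as (head, relation, tail) triples and answers a two-hop question by following a fixed relation path. Everything in it is an illustrative assumption: the triples, the relation names (`acts_on`, `has_parameter`, `produces`, `uses`, `follows`), and the helper functions `build_graph` and `follow` are hypothetical and are not taken from BioPIE's annotation schema or its QA system.

```python
from collections import defaultdict

# Hypothetical triples for a toy protocol fragment.
# Relation names and values are illustrative, not BioPIE's actual schema.
TRIPLES = [
    ("centrifuge", "acts_on", "cell suspension"),
    ("centrifuge", "has_parameter", "300 x g"),
    ("centrifuge", "has_parameter", "5 min"),
    ("centrifuge", "produces", "cell pellet"),
    ("resuspend", "acts_on", "cell pellet"),
    ("resuspend", "uses", "PBS"),
    ("resuspend", "follows", "centrifuge"),
]


def build_graph(triples):
    """Index triples by head node: head -> list of (relation, tail)."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def follow(graph, start, relations):
    """Follow a fixed relation path from `start`; return the sorted nodes reached."""
    frontier = {start}
    for relation in relations:
        frontier = {
            tail
            for node in frontier
            for rel, tail in graph.get(node, [])
            if rel == relation
        }
    return sorted(frontier)


graph = build_graph(TRIPLES)

# Two-hop question: "Which parameters apply to the step that precedes resuspension?"
answer = follow(graph, "resuspend", ["follows", "has_parameter"])
print(answer)
```

The question needs two hops (resuspend to its preceding action, then to that action's parameters), which is the kind of chained lookup that flat or coarse-grained annotations cannot support directly.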
Problem

Research questions and friction points this paper is trying to address.

High Information Density
Multi-Step Reasoning
Biomedical Experiment QA
Fine-grained Knowledge
Protocol Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Biomedical Protocol Information Extraction
High Information Density
Multi-Step Reasoning
Knowledge Graph
Question Answering
Haofei Hou
School of Advanced Manufacturing and Robotics, Peking University
Shunyi Zhao
Jiangnan University
State Estimation, Statistical Estimation Theory, Fault Detection and Diagnosis
Fanxu Meng
School of Advanced Manufacturing and Robotics, Peking University
Kairui Yang
School of Advanced Manufacturing and Robotics, Peking University
Lecheng Ruan
HIT / UCLA / PKU
Robotics, Control, Knowledge Representation
Qining Wang
School of Advanced Manufacturing and Robotics, Peking University