A Domain-Specific Curated Benchmark for Entity and Document-Level Relation Extraction

📅 2026-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing biomedical information extraction benchmarks are often narrow in scope and rely heavily on distant supervision or automatically generated annotations, which limits their usefulness for developing robust methods. To address this gap, this work introduces GutBrainIE, a multi-task benchmark focused on the gut-brain axis. Built from more than 1,600 PubMed abstracts, it provides expert-curated annotations of fine-grained entities, concept-level links, and document-level relations. By combining highly curated data with weakly supervised examples, GutBrainIE supports named entity recognition, named entity linking, and relation extraction, and is intended for developing and evaluating biomedical information extraction systems beyond the gut-brain domain.

📝 Abstract
Information Extraction (IE), encompassing Named Entity Recognition (NER), Named Entity Linking (NEL), and Relation Extraction (RE), is critical for transforming the rapidly growing volume of scientific publications into structured, actionable knowledge. This need is especially evident in fast-evolving biomedical fields such as the gut-brain axis, where research investigates complex interactions between the gut microbiota and brain-related disorders. Existing biomedical IE benchmarks, however, are often narrow in scope and rely heavily on distantly supervised or automatically generated annotations, limiting their utility for advancing robust IE methods. We introduce GutBrainIE, a benchmark based on more than 1,600 PubMed abstracts, manually annotated by biomedical and terminological experts with fine-grained entities, concept-level links, and relations. While grounded in the gut-brain axis, the benchmark's rich schema, multiple tasks, and combination of highly curated and weakly supervised data make it broadly applicable to the development and evaluation of biomedical IE systems across domains.
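
To make the three tasks concrete, the following is a minimal sketch of how one annotated abstract might be represented in memory. It is not the GutBrainIE release format: the class names, field names, entity labels, the `affects` predicate, the `CONCEPT:…` identifiers, and the toy sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    start: int        # character offset into the text, inclusive
    end: int          # character offset into the text, exclusive
    label: str        # NER output: entity type (hypothetical label set)
    concept_id: str   # NEL output: placeholder identifier, not a real vocabulary ID


@dataclass(frozen=True)
class Relation:
    head: Entity      # subject of the relation
    predicate: str    # RE output: relation type (hypothetical)
    tail: Entity      # object of the relation


@dataclass
class AnnotatedDocument:
    text: str
    entities: list[Entity] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

    def mention(self, entity: Entity) -> str:
        """Return the surface string covered by an entity span."""
        return self.text[entity.start:entity.end]


def build_toy_document() -> AnnotatedDocument:
    # Invented sentence for illustration; it is not taken from the benchmark.
    text = "Lactobacillus supplementation reduced anxiety in mice."

    def span(surface: str, label: str, concept_id: str) -> Entity:
        # Locate the first occurrence of the surface form in the text.
        start = text.index(surface)
        return Entity(start, start + len(surface), label, concept_id)

    bacterium = span("Lactobacillus", "bacteria", "CONCEPT:0001")
    disorder = span("anxiety", "disorder", "CONCEPT:0002")

    doc = AnnotatedDocument(text, [bacterium, disorder])
    # Document-level relation: it links two entities of the same document,
    # independent of whether they share a sentence.
    doc.relations.append(Relation(bacterium, "affects", disorder))
    return doc
```

In this sketch, NER fills in `start`, `end`, and `label`; NEL fills in `concept_id`; and RE adds `Relation` entries over the entities already found, which is why the three tasks are naturally evaluated on the same documents.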
Problem

Research questions and friction points this paper is trying to address.

Information Extraction
Relation Extraction
Biomedical Benchmark
Named Entity Recognition
Gut-Brain Axis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain-Specific Benchmark
Manual Curation
Entity and Relation Extraction
Biomedical Information Extraction
Gut-Brain Axis