A Unified Benchmark for Evaluating Knowledge Graph Construction Methods and Graph Neural Networks

📅 2026-05-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Automatically constructed knowledge graphs often suffer from noise, fragmentation, and semantic inconsistencies, which makes it hard to tell whether performance differences among graph neural networks stem from the models themselves or from variations in graph quality. To address this, the work proposes a dual-purpose benchmark that builds two automatically constructed graphs from the same biomedical text corpus using different extraction methods and pairs them with an expert-curated, high-quality reference graph. Through semi-supervised node classification, the framework jointly evaluates the effectiveness of knowledge graph construction methods and the robustness of graph neural networks under realistic noise conditions. The result is a standardized, reproducible, and extensible co-evaluation that supports fair comparison across graph construction techniques, with the reference graph serving as an upper bound on downstream performance.

📝 Abstract

Knowledge graphs automatically constructed from text are increasingly used in real-world applications. However, their inherent noise, fragmentation, and semantic inconsistencies significantly affect the performance of Graph Neural Networks (GNNs) on downstream tasks. Assessing GNN performance and robustness remains difficult, as it is often unclear whether observed results stem from the learning model or from the quality of the constructed graph itself. In this work, we introduce a dual-purpose benchmark designed to jointly evaluate (i) the performance of GNNs on noisy, text-derived graphs and (ii) the effectiveness of graph construction methods on a downstream task. The benchmark is built in the biomedical domain from a single textual corpus and includes two automatically constructed graphs generated using different extraction methods, alongside a high-quality reference graph curated by experts that serves as an upper performance bound. This design enables controlled comparison of construction methods and systematic evaluation of GNN robustness through semi-supervised node classification. We further provide a standardized, reproducible, and extensible evaluation framework, facilitating the integration of new graph extraction methods and learning models.
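
The evaluation protocol described in the abstract can be illustrated with a minimal sketch: the node set and label split stay fixed, only the edge set changes, and the same classifier is scored on each graph. Everything here is an assumption made for illustration. The function names, the graph names (`reference`, `extractor_a`, `extractor_b`), and the six-node toy data are hypothetical and do not come from the benchmark, and a simple majority-vote label propagation stands in for a GNN so that the example runs with the standard library alone.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def propagate_labels(nodes, edges, train_labels, rounds=10):
    """Majority-vote label propagation (a simple stand-in for a GNN).

    Labeled training nodes keep their labels; every other node takes the
    most common label among its already-labeled neighbours.
    """
    adj = build_adjacency(edges)
    pred = dict(train_labels)
    for _ in range(rounds):
        updated = dict(pred)
        for node in nodes:
            if node in train_labels:
                continue
            votes = Counter(pred[n] for n in adj[node] if n in pred)
            if votes:
                # Break ties deterministically: highest count, then label order.
                updated[node] = min(votes, key=lambda lab: (-votes[lab], lab))
        if updated == pred:
            break
        pred = updated
    return pred


def accuracy(pred, test_labels):
    """Fraction of test nodes predicted correctly (unlabeled counts as wrong)."""
    correct = sum(pred.get(n) == y for n, y in test_labels.items())
    return correct / len(test_labels)


def evaluate_graphs(nodes, graphs, train_labels, test_labels):
    """Score the same classifier and label split on every candidate graph."""
    return {
        name: accuracy(propagate_labels(nodes, edges, train_labels), test_labels)
        for name, edges in graphs.items()
    }


# Toy data (hypothetical): same nodes and labels, three different edge sets.
nodes = [0, 1, 2, 3, 4, 5]
train_labels = {0: "A", 3: "B"}
test_labels = {1: "A", 2: "A", 4: "B", 5: "B"}
graphs = {
    "reference": [(0, 1), (1, 2), (3, 4), (4, 5)],    # clean graph
    "extractor_a": [(0, 1), (3, 4)],                  # fragmented: missing edges
    "extractor_b": [(0, 1), (1, 2), (3, 4), (0, 5)],  # noisy: one spurious edge
}

scores = evaluate_graphs(nodes, graphs, train_labels, test_labels)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because the classifier and the label split are held constant, any gap between a constructed graph and the reference graph in this toy setup is attributable to the edges alone, which is the kind of controlled comparison the benchmark aims for. Swapping `propagate_labels` for a different model while keeping `graphs` fixed would give the complementary view of model robustness.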

Problem

Research questions and friction points this paper is trying to address.

Knowledge Graph Construction

Graph Neural Networks

Noise

Evaluation Benchmark

Semantic Inconsistency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph Construction

Graph Neural Networks

Benchmark