BLINKG: A Benchmark for LLM-Integrated Knowledge Graph Generation

📅 2026-05-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the persistent reliance on manual effort in aligning schema elements from heterogeneous data sources with ontology terms during knowledge graph construction, as well as the absence of standardized benchmarks for evaluating large language models’ (LLMs) semantic mapping capabilities in this context. To bridge this gap, we introduce BLINKG—the first benchmark specifically designed to assess LLM performance in semantic mapping for knowledge graph generation. BLINKG features a suite of progressively complex, real-world-inspired mapping tasks and an extensible evaluation framework. Experimental results demonstrate that while current LLMs perform adequately on simple mappings, they exhibit significant limitations in more complex scenarios, thereby providing crucial quantitative insights and actionable directions for advancing semi-automated knowledge graph construction.

📝 Abstract

Generating Knowledge Graphs (KGs) remains one of the most time-consuming and labor-intensive tasks for knowledge engineers, as they need to identify semantic equivalences between input data sources and ontology terms. While declarative solutions (e.g., RML, SPARQL-Anything) have helped to generalize this process, aligning input schema elements with ontology terms still involves intricate transformations and requires considerable manual effort. With the advent of Large Language Models (LLMs), there is growing interest in leveraging their capabilities to assist KG engineers. Although some studies have explored using LLMs to automate KG construction, there is still no standardized framework for assessing how effectively they establish correspondences between data schemas and ontology concepts. Therefore, in this paper, we propose BLINKG, a benchmark designed to evaluate the mapping capabilities of LLMs in constructing KGs from heterogeneous data sources. The benchmark includes a set of scenarios of increasing complexity, based on real-world use cases. We conduct an extensive experimental evaluation of several state-of-the-art LLMs using BLINKG and observe that they already offer promising solutions. However, their performance remains limited in complex scenarios. Thanks to this benchmark, we can already assess the current capabilities of LLMs for KG construction. Additionally, we define a set of requirements for achieving (semi)automated (LLM-driven) KG construction, opening new research lines in this area.

Problem

Research questions and friction points this paper is trying to address.

Knowledge Graph Generation

Large Language Models

Ontology Alignment

Benchmarking

Schema Matching

Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Graph Generation

Large Language Models

Schema-to-Ontology Mapping