OAEI-LLM: A Benchmark Dataset for Understanding Large Language Model Hallucinations in Ontology Matching

📅 2024-09-21
🏛️ arXiv.org

🤖 AI Summary
Large language models (LLMs) produce hallucinations when performing ontology matching (OM), such as spurious alignments, fabricated entities, and logical inconsistencies, which undermines their reliability in knowledge-intensive tasks. Method: We introduce OAEI-LLM, a benchmark dataset for understanding LLM hallucinations in ontology matching that extends the datasets of the Ontology Alignment Evaluation Initiative (OAEI). We formally define and annotate LLM-specific hallucination types in ontology matching, propose a hallucination-aware data construction paradigm with schema extension mechanisms, and ensure annotation reliability via multi-stage validation: human verification, rule-based consistency checks, expert review, and cross-model agreement analysis. Contribution/Results: We release a multi-domain ontology alignment benchmark with fine-grained hallucination annotations, enabling rigorous evaluation of hallucination detection methods, robust semantic alignment techniques, and trustworthy LLM-based ontology matching (LLM-OM) approaches.

📝 Abstract
Hallucinations of large language models (LLMs) commonly occur in domain-specific downstream tasks, and ontology matching (OM) is no exception. The growing use of LLMs for OM raises the need for benchmarks to better understand LLM hallucinations. The OAEI-LLM dataset is an extended version of the Ontology Alignment Evaluation Initiative (OAEI) datasets, designed to evaluate LLM-specific hallucinations in OM tasks. We outline the methodology used in dataset construction and schema extension, and provide examples of potential use cases.
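
To make the idea of annotating LLM-specific hallucinations against a reference alignment more concrete, here is a minimal Python sketch. It is not the paper's construction method or schema. The `Mapping` class, the `label_discrepancies` function, the labels "missing" and "spurious", and the entity names are all hypothetical and chosen only for illustration. The sketch assumes that an alignment can be treated as a set of (source, target) entity pairs, and that a discrepancy is any pair present in one set but not the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """One correspondence between a source and a target entity (hypothetical type)."""
    source: str
    target: str


def label_discrepancies(reference, predicted):
    """Compare LLM-predicted mappings with a reference alignment.

    Returns a dict from Mapping to an illustrative label:
      "missing"  - in the reference, but not produced by the LLM
      "spurious" - produced by the LLM, but absent from the reference
    Mappings present in both sets are treated as agreed and are not labelled.
    """
    reference, predicted = set(reference), set(predicted)
    labels = {}
    for mapping in reference - predicted:
        labels[mapping] = "missing"
    for mapping in predicted - reference:
        labels[mapping] = "spurious"
    return labels


# Hypothetical entity names, used only to exercise the function.
reference = [Mapping("a:Paper", "b:Article"), Mapping("a:Author", "b:Writer")]
predicted = [Mapping("a:Paper", "b:Article"), Mapping("a:Author", "b:Reviewer")]

labels = label_discrepancies(reference, predicted)
for mapping, label in sorted(labels.items(), key=lambda item: item[0].target):
    print(f"{mapping.source} -> {mapping.target}: {label}")
```

A finer-grained annotation scheme, such as the hallucination types the dataset defines, would replace the two coarse labels used here; the set-difference comparison is only a starting point.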

Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Ontology Alignment
Educational Content

Innovation

Methods, ideas, or system contributions that make the work stand out.

OAEI-LLM
Ontology Matching
Hallucination Errors