Lie to Me: Knowledge Graphs for Robust Hallucination Self-Detection in LLMs

📅 2025-12-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Large language models (LLMs) produce hallucinations that are difficult to self-diagnose. Method: This paper proposes a zero-shot, knowledge-graph-based self-checking paradigm: LLM outputs are automatically parsed into entity-relation graphs, and hallucination probability is quantified via atomic fact consistency modeling. The approach is lightweight, model-agnostic, requires no fine-tuning, and avoids any reliance on LLM parameters or training data. Contributions/Results: (1) first to structure LLM responses as knowledge graphs to enable hallucination probability modeling; (2) introduces a high-quality, human-verified hallucination evaluation dataset; (3) achieves up to 16% higher accuracy and 20% higher F1-score than SelfCheckGPT on GPT-4o and Gemini-2.5-Flash, demonstrating that graph-structured representation significantly enhances hallucination detection.

๐Ÿ“ Abstract
Hallucinations, the generation of apparently convincing yet false statements, remain a major barrier to the safe deployment of LLMs. Building on the strong performance of self-detection methods, we examine the use of structured knowledge representations, namely knowledge graphs, to improve hallucination self-detection. Specifically, we propose a simple yet powerful approach that enriches hallucination self-detection by (i) converting LLM responses into knowledge graphs of entities and relations, and (ii) using these graphs to estimate the likelihood that a response contains hallucinations. We evaluate the proposed approach using two widely used LLMs, GPT-4o and Gemini-2.5-Flash, across two hallucination detection datasets. To support more reliable future benchmarking, one of these datasets has been manually curated and enhanced and is released as a secondary outcome of this work. Compared to standard self-detection methods and SelfCheckGPT, a state-of-the-art approach, our method achieves up to 16% relative improvement in accuracy and 20% in F1-score. Our results show that LLMs can better analyse atomic facts when they are structured as knowledge graphs, even when initial outputs contain inaccuracies. This low-cost, model-agnostic approach paves the way toward safer and more trustworthy language models.
Problem

Research questions and friction points this paper is trying to address.

Hallucinations in LLM outputs are plausible-sounding yet false, making them hard for the model to self-diagnose.
Free-text responses are unstructured, which hinders verification of individual atomic facts.
Standard self-detection methods and SelfCheckGPT leave substantial room for improvement in accuracy and F1-score.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses knowledge graphs to structure LLM responses for analysis.
Estimates hallucination likelihood via entity-relation graph representations.
Achieves improved accuracy with a model-agnostic, low-cost method.
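The pipeline described above (parse the response into entity-relation triples, then judge each atomic fact for consistency and aggregate) can be sketched as follows. The paper's exact scoring formula is not reproduced here, so `hallucination_score`, the triple format, and the consistency oracle are hypothetical names; in a real system the LLM itself would extract the triples and judge each fact.

```python
from typing import Callable, List, Tuple

# An atomic fact as a (subject, relation, object) triple,
# e.g. ("Marie Curie", "born_in", "Warsaw").
Triple = Tuple[str, str, str]


def hallucination_score(
    triples: List[Triple],
    consistency: Callable[[Triple], float],
) -> float:
    """Aggregate per-fact inconsistency into a response-level score.

    `consistency` maps a triple to a value in [0, 1], where 1 means the
    checker judges the fact fully consistent. The returned score is the
    mean inconsistency: higher values suggest the response is more
    likely to contain hallucinations.
    """
    if not triples:
        return 0.0
    return sum(1.0 - consistency(t) for t in triples) / len(triples)


# Illustrative stand-in for an LLM self-check call: a lookup table of
# made-up per-fact consistency judgements (not real model outputs).
fake_judgements = {
    ("Marie Curie", "born_in", "Warsaw"): 1.0,
    ("Marie Curie", "won", "Fields Medal"): 0.0,
}

triples = list(fake_judgements)
score = hallucination_score(triples, lambda t: fake_judgements[t])
# score == 0.5: one of the two atomic facts is judged inconsistent.
```

The key design choice the paper's results support is the decomposition itself: scoring per-triple lets the model assess each fact in isolation, rather than judging the whole free-text response at once.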