Zero-resource Hallucination Detection for Text Generation via Graph-based Contextual Knowledge Triples Modeling

📅 2024-09-17
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses hallucination detection for open-ended long-text generation by large language models (LLMs), particularly in zero-resource settings, where no external knowledge is available and complex inter-factual dependencies and alignments must be modeled implicitly within the text. To this end, the authors propose a graph-structured detection framework. Methodologically, it introduces context-dependency graph modeling based on knowledge triples, integrating response segmentation, relational graph convolutional network (RGCN) message passing, LLM-driven inverse triple reconstruction, and consistency alignment. Crucially, the approach enables end-to-end modeling of implicit factual associations within long texts without external retrieval. Evaluated on multiple long-text hallucination benchmarks, it significantly outperforms state-of-the-art methods in both accuracy and robustness, establishing a scalable, zero-resource paradigm for hallucination detection in long-text LLM generation.

📝 Abstract
LLMs achieve remarkable performance but suffer from hallucinations. Most research on hallucination detection focuses on questions with short, concrete answers whose faithfulness is easy to check. Detecting hallucinations in open-ended text generation is more challenging. Some researchers use external knowledge to detect hallucinations in generated texts, but external resources for specific scenarios are hard to access. Recent studies on detecting hallucinations in long texts without external resources compare the consistency among multiple sampled outputs. To handle long texts, researchers split them into multiple facts and individually compare the consistency of each pair of facts. However, these methods (1) hardly achieve alignment among multiple facts and (2) overlook dependencies between contextual facts. In this paper, we propose graph-based context-aware (GCA) hallucination detection for text generation, which aligns knowledge facts and considers the dependencies between contextual knowledge triples during consistency comparison. In particular, to align multiple facts, we conduct a triple-oriented response segmentation to extract multiple knowledge triples. To model dependencies among contextual knowledge triples (facts), we construct the contextual triples into a graph and enhance the triples' interactions via message passing and aggregation with an RGCN. To avoid omitting knowledge triples in long texts, we conduct an LLM-based reverse verification by reconstructing the knowledge triples. Experiments show that our model enhances hallucination detection and outperforms all baselines.
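The RGCN message passing over the contextual triple graph can be illustrated with a minimal numpy sketch of a single relational graph convolution layer (in the style of Schlichtkrull et al.); the function name, the edge-list format, and the in-degree normalization are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def rgcn_layer(h, edges, num_relations, rel_weights, self_weight):
    """One relational graph convolution layer over triple nodes:
    h_i' = ReLU(W0 h_i + sum_r sum_{j in N_r(i)} (1 / c_{i,r}) W_r h_j).
    h: (num_nodes, d_in) node features; edges: list of (src, dst, relation)."""
    n, _ = h.shape
    out = h @ self_weight.T  # self-loop term W0 h_i
    for r in range(num_relations):
        rel_edges = [(s, t) for s, t, rel in edges if rel == r]
        # c_{i,r}: in-degree of node i under relation r (simple normalization)
        deg = np.zeros(n)
        for _, t in rel_edges:
            deg[t] += 1.0
        for s, t in rel_edges:
            out[t] += (h[s] @ rel_weights[r].T) / deg[t]
    return np.maximum(out, 0.0)  # ReLU
```

Stacking such layers lets each triple node aggregate evidence from related contextual triples before the consistency comparison.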
Problem

Research questions and friction points this paper is trying to address.

Detecting hallucinations in open-ended text generation without external resources.
Resolving alignment and dependency issues among multiple contextual knowledge triples.
Enhancing hallucination detection via graph-based context-aware modeling and reverse verification.
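The alignment friction point above can be made concrete: once responses are segmented into knowledge triples, triples from the original response and a sampled output can be paired for consistency comparison. The sketch below matches triples on a (subject, relation) key; the key choice and function names are illustrative assumptions, not the paper's exact procedure:

```python
def triple_key(triple):
    """Alignment key for a (subject, relation, object) triple."""
    subj, rel, _ = triple
    return (subj.lower(), rel.lower())

def align_triples(response_triples, sampled_triples):
    """Pair each response triple with a sampled-output triple sharing the
    same (subject, relation); unmatched triples are candidate hallucinations
    or omissions that need further checking."""
    index = {}
    for t in sampled_triples:
        index.setdefault(triple_key(t), []).append(t)
    aligned, unmatched = [], []
    for t in response_triples:
        candidates = index.get(triple_key(t))
        if candidates:
            aligned.append((t, candidates[0]))
        else:
            unmatched.append(t)
    return aligned, unmatched
```

An aligned pair with conflicting objects (e.g. two different capitals for the same country) signals an inconsistency between samples.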
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based context-aware hallucination detection
Triple-oriented response segmentation for alignment
LLM-based reverse verification for knowledge triples
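The reverse-verification idea in the last bullet can be sketched as follows: an LLM re-extracts triples from each response segment, and any originally extracted triple the reconstruction fails to recover is flagged as a possible omission. The `reconstruct_fn` callable stands in for the actual LLM prompt, and the coverage score is an illustrative assumption:

```python
def reverse_verify(segments, extracted_triples, reconstruct_fn):
    """LLM-based reverse verification sketch: reconstruct_fn(segment) is a
    stand-in for an LLM call that re-extracts triples from one text segment.
    Returns a coverage score and the extracted triples left unrecovered."""
    recovered = set()
    for seg in segments:
        recovered.update(reconstruct_fn(seg))
    missing = [t for t in extracted_triples if t not in recovered]
    coverage = 1.0 - len(missing) / max(len(extracted_triples), 1)
    return coverage, missing
```

Low coverage indicates that the initial segmentation dropped facts from the long text, so those segments would be re-processed before the consistency comparison.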
Xinyue Fang
College of Computer, National University of Defense Technology
Zhen Huang
College of Computer, National University of Defense Technology
Zhiliang Tian
College of Computer, National University of Defense Technology
Minghui Fang
Zhejiang University
Ziyi Pan
College of Computer, National University of Defense Technology
Quntian Fang
College of Computer, National University of Defense Technology
Zhihua Wen
College of Computer, National University of Defense Technology
H. Pan
College of Computer, National University of Defense Technology
Dongsheng Li
College of Computer, National University of Defense Technology