RePPL: Recalibrating Perplexity by Uncertainty in Semantic Propagation and Language Generation for Explainable QA Hallucination Detection

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Hallucination attribution remains challenging in large language model (LLM) question answering because spurious outputs are hard to trace back to specific input tokens. Method: This paper proposes the first explainable hallucination detection framework integrating dual-path uncertainty modeling: semantic propagation (via dynamic attention fusion) and language generation (via probabilistic sampling). It introduces a token-level uncertainty scoring mechanism whose scores can be attributed to input tokens, augmented by log-average perplexity-style reweighting and hierarchical semantic uncertainty estimation. Contribution/Results: Evaluated across multiple QA benchmarks, the framework achieves a state-of-the-art average AUC of 0.833. It is also the first to enable fine-grained visualization of hallucination triggers via token-level uncertainty heatmaps, supporting diagnostic attribution and root-cause analysis of hallucinated generations.
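To make the scoring recipe concrete, below is a minimal sketch of the general idea the summary describes: each generated token receives an uncertainty score that combines its generation-side surprisal with an attention-derived propagation weight, and the per-token scores are aggregated in a perplexity-style log-average. The function names, the weighting scheme, and the numbers are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only; not RePPL's actual formulation.
# Combine generation-side surprisal (-log p) with a hypothetical
# attention-derived semantic-propagation weight, then aggregate the
# per-token scores as a log-average, analogous to (log-)perplexity.
from typing import List


def token_uncertainty(logprob: float, propagation_weight: float) -> float:
    """Recalibrate a token's surprisal by a propagation weight in [0, 1]."""
    return propagation_weight * (-logprob)


def log_average_score(logprobs: List[float], weights: List[float]) -> float:
    """Average the recalibrated per-token scores over the generation."""
    scores = [token_uncertainty(lp, w) for lp, w in zip(logprobs, weights)]
    return sum(scores) / max(len(scores), 1)


# Three generated tokens: log-probabilities from the model, weights
# (hypothetically) read off attention fused across layers.
print(log_average_score([-0.2, -1.5, -3.0], [0.4, 0.9, 0.7]))
```

A higher aggregate score would flag the answer as more likely hallucinated; the actual method derives both components from the model's attention maps and sampling distribution rather than fixed constants.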

📝 Abstract
Large Language Models (LLMs) have become powerful, but hallucinations remain a major obstacle to their trustworthy use. While previous works improved hallucination detection by measuring uncertainty, they lack the ability to explain the provenance of hallucinations, i.e., which parts of the input tend to trigger them. Recent works on prompt attacks indicate that uncertainty exists in semantic propagation, where attention mechanisms gradually fuse local token information into high-level semantics across layers. Meanwhile, uncertainty also emerges in language generation, due to the probability-based selection of high-level semantics during sampling. Based on this, we propose RePPL, which recalibrates the uncertainty measurement along these two aspects, assigns an explainable uncertainty score to each token, and aggregates the scores in a perplexity-style log-average form as the total score. Experiments show that our method achieves the best overall detection performance across various QA datasets on advanced models (average AUC of 0.833), and that it produces token-level uncertainty scores that serve as explanations for hallucinations. Leveraging these scores, we preliminarily identify a chaotic pattern of hallucination and showcase its promising uses.
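As a toy illustration of how such per-token scores could be surfaced as explanations, the snippet below renders made-up scores as a coarse text heatmap over input tokens. The tokens, scores, and rendering are all hypothetical; the paper attributes scores to input tokens through its own uncertainty propagation, which is not reproduced here.

```python
# Toy rendering of per-token uncertainty as a text heatmap.
# Tokens and scores are invented; darker shades mark higher uncertainty.
def render_heatmap(tokens, scores, shades="░▒▓█"):
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0
    idx = [min(int((s - lo) / span * len(shades)), len(shades) - 1) for s in scores]
    return " ".join(f"{t}{shades[i]}" for t, i in zip(tokens, idx))


print(render_heatmap(["Who", "wrote", "the", "1998", "novel", "?"],
                     [0.1, 0.2, 0.1, 1.7, 0.9, 0.2]))
```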
Problem

Research questions and friction points this paper is trying to address.

How to detect hallucinations in LLM outputs from token-level uncertainty
How to explain which parts of the input trigger hallucinations, via uncertainty in semantic propagation and language generation
How to recalibrate perplexity-style scores for more reliable QA hallucination detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recalibrates uncertainty using both semantic-propagation and language-generation signals
Assigns token-level uncertainty scores that serve as explanations
Aggregates the per-token scores in a perplexity-style log-average form (see the formula sketch below)
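For reference, the log-average aggregation mirrors how log-perplexity averages token surprisal over a sequence. The notation below is assumed for illustration (u_t stands for a recalibrated per-token uncertainty score) and is not taken from the paper.

```latex
% Generic perplexity-style log-average; notation assumed, not the paper's.
\log \mathrm{PPL}(y \mid x) = \frac{1}{T}\sum_{t=1}^{T} -\log p\left(y_t \mid y_{<t}, x\right),
\qquad
\mathrm{Score}(y \mid x) = \frac{1}{T}\sum_{t=1}^{T} u_t .
```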
👥 Authors
Yiming Huang · The Hong Kong University of Science and Technology, Guangzhou
Junyan Zhang · National University of Singapore · Large Language Model
Zihao Wang · The Hong Kong University of Science and Technology
Biquan Bie · Independent Researcher
Xuming Hu · Assistant Professor, HKUST(GZ) / HKUST · Natural Language Processing, Large Language Model
Yi R. Fung · The Hong Kong University of Science and Technology
Xinlei He · Assistant Professor, HKUST(GZ) · Trustworthy Machine Learning, Security, Privacy