🤖 AI Summary
Hallucination attribution remains challenging in large language model (LLM) question-answering due to the difficulty of tracing spurious outputs to specific input tokens.
Method: This paper proposes the first explainable hallucination detection framework integrating dual-path uncertainty modeling: semantic propagation (via dynamic attention fusion) and linguistic generation (via probabilistic sampling). It introduces a token-level uncertainty scoring mechanism attributable to input tokens, augmented by log-average perplexity reweighting and hierarchical semantic uncertainty estimation.
Contribution/Results: Evaluated across multiple QA benchmarks, the framework achieves a state-of-the-art average AUC of 0.833. It is the first to enable fine-grained visualization of hallucination triggers via token-level uncertainty heatmaps, supporting diagnostic attribution and root-cause analysis of hallucinated generations.
📝 Abstract
Large Language Models (LLMs) have become powerful, but hallucinations remain a critical obstacle to their trustworthy use. While previous works have improved hallucination detection by measuring uncertainty, they lack the ability to explain the provenance of hallucinations, i.e., which parts of the input tend to trigger them. Recent work on prompt attacks indicates that uncertainty exists in semantic propagation, where attention mechanisms gradually fuse local token information into high-level semantics across layers. Meanwhile, uncertainty also emerges in language generation, owing to its probability-based selection of high-level semantics for sampled generations. Based on these observations, we propose RePPL, which recalibrates uncertainty measurement along both aspects, dispatching explainable uncertainty scores to each token and aggregating them into a total score in a perplexity-style log-average form. Experiments show that our method achieves the best overall detection performance across various QA datasets on advanced models (average AUC of 0.833), and it produces token-level uncertainty scores as explanations for the hallucination. Leveraging these scores, we preliminarily identify the chaotic pattern of hallucination and showcase its promising use.
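The abstract describes dispatching per-token uncertainty scores and aggregating them in a perplexity-style log-average form. A minimal sketch of what such an aggregation might look like follows; the function names and the exact per-token scores are illustrative assumptions, not the paper's implementation:

```python
import math

def perplexity(token_logprobs):
    """Standard perplexity: exp of the negative mean token log-probability."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def log_average_score(token_scores):
    """Hypothetical perplexity-style aggregation of per-token uncertainty
    scores: the geometric mean, i.e. exp of the mean log score. Assumes
    each score is a positive number assigned to one token."""
    return math.exp(sum(math.log(s) for s in token_scores) / len(token_scores))
```

As with ordinary perplexity, the geometric mean keeps the aggregate on the same scale as the per-token scores while letting a few high-uncertainty tokens dominate less than an arithmetic maximum would.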