🤖 AI Summary
Quantifying uncertainty in autoregressive generation by large language models (LLMs) remains challenging due to implicit conditional dependencies across decoding steps, which hinder accurate uncertainty estimation.
Method: We propose a dynamic uncertainty modulation mechanism that explicitly models latent conditional dependencies between decoding steps. A regression model is trained to predict the gap between conditional and unconditional generation confidence. During LLM inference, the predicted gap is used to adjust each step's uncertainty based on the uncertainty of the previous step, so that uncertainty propagates across the generated sequence.
Contribution/Results: Across nine benchmark datasets and three mainstream LLMs, our method consistently outperforms existing uncertainty quantification (UQ) approaches, with substantial gains in detecting hallucinations and low-quality outputs. The approach offers an interpretable, computationally tractable framework for uncertainty estimation in support of trustworthy LLM generation.
📝 Abstract
Uncertainty quantification (UQ) is a promising approach to detecting Large Language Model (LLM) hallucinations and low-quality output. In this work, we address one of the challenges of UQ in generation tasks: the conditional dependency between the generation steps of an LLM. We propose to learn this dependency from data. We train a regression model whose target variable is the gap between the conditional and the unconditional generation confidence. During LLM inference, we use this learned conditional dependency model to modulate the uncertainty of the current generation step based on the uncertainty of the previous step. Our experimental evaluation on nine datasets and three LLMs shows that the proposed method is highly effective for uncertainty quantification, achieving substantial improvements over competing approaches.
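To make the idea concrete, the following is a minimal sketch of the two stages described above, not the authors' implementation. It rests on several assumptions that the abstract does not specify: the regressor is a one-feature linear model, its only input is the previous step's adjusted confidence, the gap is defined as conditional minus unconditional confidence, and the adjusted confidence is obtained by subtracting the predicted gap. The function names `fit_gap_regressor` and `modulate_confidence` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_gap_regressor(prev_conf: Sequence[float],
                      gap: Sequence[float]) -> Tuple[float, float]:
    """Fit gap ~ slope * prev_conf + intercept by ordinary least squares.

    Assumption: gap = conditional confidence - unconditional confidence,
    measured on training data where both quantities are available.
    """
    n = len(prev_conf)
    if n == 0 or n != len(gap):
        raise ValueError("prev_conf and gap must be non-empty and equal length")
    mean_x = sum(prev_conf) / n
    mean_y = sum(gap) / n
    var_x = sum((x - mean_x) ** 2 for x in prev_conf)
    if var_x == 0.0:
        # Degenerate case: a constant feature can only support a constant prediction.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(prev_conf, gap))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def modulate_confidence(conditional_conf: Sequence[float],
                        slope: float,
                        intercept: float) -> List[float]:
    """Adjust each step's confidence using the previous step's adjusted value.

    Assumption: the first step has no predecessor and is left unchanged.
    """
    adjusted: List[float] = []
    for step, conf in enumerate(conditional_conf):
        if step == 0:
            value = conf
        else:
            predicted_gap = slope * adjusted[-1] + intercept
            # Clip so the result remains a valid confidence in [0, 1].
            value = min(1.0, max(0.0, conf - predicted_gap))
        adjusted.append(value)
    return adjusted


if __name__ == "__main__":
    # Toy training data (invented for illustration): the gap shrinks as the
    # previous step becomes more confident.
    slope, intercept = fit_gap_regressor([0.2, 0.4, 0.6, 0.8],
                                         [0.4, 0.3, 0.2, 0.1])
    print(modulate_confidence([0.2, 0.9], slope, intercept))
    print(modulate_confidence([0.9, 0.9], slope, intercept))
```

On this toy data, a step with conditional confidence 0.9 is pulled down to roughly 0.5 when it follows a low-confidence step (0.2), but only to roughly 0.85 when it follows a high-confidence step (0.9). This illustrates the intended effect of propagating uncertainty from one generation step to the next; a real system would likely use richer features and a more expressive regressor.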