🤖 AI Summary
Language models frequently generate unreliable outputs, such as hallucinations, in part because standard decoding does not explicitly model uncertainty. To address this, the authors propose an uncertainty-aware generalization of Minimum Bayes Risk (MBR) decoding that incorporates a posterior distribution over model parameters into MBR's computation of expected risk, making predictive uncertainty an explicit part of output selection. The modified expected risk serves two purposes: choosing the best output among candidates and deciding when to abstain from generating at all, and it can yield improvements without incurring additional inference overhead. The authors benchmark several methods for learning the parameter posterior and find that performance improves with prediction diversity. The implementation is publicly available.
📝 Abstract
Despite their outstanding performance in the majority of scenarios, contemporary language models still occasionally generate undesirable outputs, for example, hallucinated text. While such behaviors have previously been linked to uncertainty, there is a notable lack of methods that actively consider uncertainty during text generation. In this work, we show how Minimum Bayes Risk (MBR) decoding, which selects model generations according to an expected risk, can be generalized into a principled uncertainty-aware decoding method. In short, we account for model uncertainty during decoding by incorporating a posterior over model parameters into MBR's computation of expected risk. We show that this modified expected risk is useful for both choosing outputs and deciding when to abstain from generation and can provide improvements without incurring overhead. We benchmark different methods for learning posteriors and show that performance improves with prediction diversity. We release our code publicly.
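To make the idea concrete, here is a minimal sketch of uncertainty-aware MBR selection with abstention. It is an illustration of the general recipe described in the abstract, not the paper's actual implementation: the posterior over parameters is represented by a list of pseudo-reference sets (one per posterior draw, e.g. an ensemble member or MC-dropout pass), and the utility is a simple token-overlap F1 standing in for a real metric such as BLEU or COMET. The function names (`uncertainty_aware_mbr`, `token_f1`) and the `abstain_below` threshold are illustrative assumptions.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1: a toy stand-in utility for a real metric (BLEU/COMET)."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    prec, rec = overlap / len(h), overlap / len(r)
    return 2 * prec * rec / (prec + rec)


def uncertainty_aware_mbr(candidates, samples_per_draw,
                          utility=token_f1, abstain_below=None):
    """Pick the candidate with highest expected utility under a parameter posterior.

    candidates: hypothesis strings to choose among.
    samples_per_draw: list of lists; samples_per_draw[k] holds pseudo-references
        sampled from the k-th posterior draw of the model parameters. Averaging
        over draws folds parameter uncertainty into the expected-utility estimate.
    abstain_below: if set and the best expected utility falls below it, return
        (None, score) to signal abstention on a high-uncertainty input.
    """
    def expected_utility(hyp):
        # Average utility over samples within each draw, then over draws.
        per_draw = [sum(utility(hyp, ref) for ref in refs) / len(refs)
                    for refs in samples_per_draw]
        return sum(per_draw) / len(per_draw)

    best_score, best = max((expected_utility(c), c) for c in candidates)
    if abstain_below is not None and best_score < abstain_below:
        return None, best_score
    return best, best_score
```

Setting `abstain_below` turns the same expected-risk quantity into a rejection rule: inputs whose best candidate still scores poorly under the posterior are flagged rather than answered, which is the "deciding when to abstain" use case above.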