🤖 AI Summary
To address the uncertainty arising from ambiguous passage utility in Retrieval-Augmented Generation (RAG), this paper proposes a lightweight passage utility estimation method that quantifies the actual contribution of retrieved passages to a downstream question-answering (QA) model, thereby improving confidence calibration for answer correctness. The core innovation is formulating passage utility estimation as the central mechanism for RAG uncertainty quantification: a small end-to-end neural network, combined with information-theoretic signals (e.g., entropy), replaces costly sampling-based approaches. The method integrates into standard RAG pipelines without architectural modifications. Empirical evaluation across multiple benchmarks shows clear improvements over information-theoretic baselines and matches or surpasses expensive sampling-based uncertainty estimation methods. The code and datasets are publicly released.
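The summary describes a small neural network that maps features of a retrieved passage to a utility score for the target QA model. The sketch below is purely illustrative of that idea, not the paper's implementation: the feature vector, network size, and function names (`init_mlp`, `utility_score`) are all assumptions, and a real estimator would be trained on supervision derived from the QA model's answer correctness.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(d_in: int, d_hidden: int) -> dict:
    """Randomly initialise a one-hidden-layer regressor (untrained, for illustration)."""
    return {
        "W1": rng.normal(0.0, 0.1, (d_in, d_hidden)),
        "b1": np.zeros(d_hidden),
        "W2": rng.normal(0.0, 0.1, (d_hidden, 1)),
        "b2": np.zeros(1),
    }

def utility_score(params: dict, features: np.ndarray) -> float:
    """Map a passage feature vector to a utility score in (0, 1).

    A sigmoid output lets the score be read as the probability that
    the passage helps the QA model answer correctly.
    """
    h = np.tanh(features @ params["W1"] + params["b1"])
    logit = h @ params["W2"] + params["b2"]
    return float(1.0 / (1.0 + np.exp(-logit[0])))

# Example: score a hypothetical 4-dimensional passage feature vector.
params = init_mlp(d_in=4, d_hidden=8)
score = utility_score(params, np.array([0.2, 1.3, -0.5, 0.7]))
```

Because the head is a single sigmoid unit, such a model adds negligible cost at test time compared with sampling multiple answers from the QA model.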
📝 Abstract
Retrieval-augmented Question Answering (QA) helps QA models overcome knowledge gaps by incorporating retrieved evidence, typically a set of passages, alongside the question at test time. Previous studies show that this approach improves QA performance and reduces hallucinations, without, however, assessing whether the retrieved passages are indeed useful for answering correctly. In this work, we propose to quantify the uncertainty of a QA model by estimating the utility of the passages it is provided with. We train a lightweight neural model to predict passage utility for a target QA model and show that while simple information-theoretic metrics can predict answer correctness to a certain extent, our approach efficiently approximates or outperforms more expensive sampling-based methods. Code and data are available at https://github.com/lauhaide/ragu.
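The abstract's information-theoretic baseline can be illustrated with a minimal sketch: the mean per-token Shannon entropy of the QA model's output distributions, where low entropy (a confident decoder) is taken as a proxy for a likely-correct answer. This is a generic construction, not the paper's exact metric; `sequence_entropy` and its input format are assumptions.

```python
import numpy as np

def sequence_entropy(token_probs: list[np.ndarray]) -> float:
    """Mean per-token Shannon entropy (in nats) over a generated answer.

    token_probs: one probability distribution over the vocabulary per
    generated token. Lower values indicate a more confident decoder,
    which serves as a cheap predictor of answer correctness.
    """
    entropies = [
        -np.sum(p * np.log(np.clip(p, 1e-12, 1.0)))  # clip avoids log(0)
        for p in token_probs
    ]
    return float(np.mean(entropies))

# A peaked (confident) distribution yields lower entropy than a uniform one.
peaked = [np.array([0.97, 0.01, 0.01, 0.01])]
uniform = [np.full(4, 0.25)]
```

Unlike sampling-based uncertainty estimates, this requires only a single forward pass, which is why such metrics serve as the cheap baseline the learned utility estimator is compared against.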