AI Summary
Unreliable confidence estimation of large language model (LLM) responses hinders their trustworthy deployment in high-stakes applications. To address this, we propose Graph Self-Consistency (GSC), the first method that formalizes self-consistency as a multi-response consistency graph, where candidate responses serve as nodes. Leveraging graph neural networks (GNNs), GSC performs fine-grained, unsupervised probabilistic correctness assessment of each node, enabling label-free confidence calibration. Crucially, GSC transforms the inherently unstructured self-consistency paradigm into a learnable graph representation and realizes end-to-end confidence modeling via GNNs. The approach exhibits strong out-of-domain generalization: it improves calibration across multiple benchmarks, reducing Expected Calibration Error (ECE) by over 35% on average, while remaining robust on tasks from unseen domains.
Abstract
Reliable confidence estimation is essential for enhancing the trustworthiness of large language models (LLMs), especially in high-stakes scenarios. Despite its importance, accurately estimating confidence in LLM responses remains a significant challenge. In this work, we propose using an auxiliary learning model to assess response correctness based on the self-consistency of multiple outputs generated by the LLM. Our method builds a consistency graph to represent the agreement among multiple responses and uses a graph neural network (GNN) to estimate the likelihood that each response is correct. Experiments demonstrate that this method achieves strong calibration performance on various benchmark datasets and generalizes well to out-of-domain cases.
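To make the graph construction concrete, here is a minimal sketch of how a consistency graph over candidate responses might be built. The agreement measure (exact string match) and the degree-based confidence proxy standing in for the learned GNN are simplifying assumptions for illustration, not the paper's actual model; the function names are hypothetical.

```python
def consistency_graph(responses):
    # Nodes are candidate responses; an edge connects two responses
    # that agree. Exact string match is an assumed, simplified
    # agreement measure (the real method may use softer similarity).
    n = len(responses)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and responses[i] == responses[j]:
                adj[i][j] = 1
    return adj

def node_confidence(responses):
    # Degree-based proxy for the GNN's per-node correctness estimate:
    # a response that agrees with more of its peers receives a higher
    # confidence score (normalized to (0, 1] by counting the node itself).
    adj = consistency_graph(responses)
    n = len(responses)
    return [(sum(row) + 1) / n for row in adj]

# Example: four sampled answers, three of which agree.
scores = node_confidence(["42", "42", "17", "42"])
print(scores)  # the majority answer's nodes score higher than the outlier
```

In the actual method, this hand-crafted degree heuristic is replaced by a GNN that learns, without labels, to map each node's position in the consistency graph to a calibrated probability of correctness.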