AI Summary
Graph Neural Networks (GNNs) commonly suffer from underconfident predictions, undermining decision reliability; existing calibration methods rely on auxiliary modules, lack theoretical grounding, and incur additional computational overhead.
Method: We propose Terminal-layer Unified Calibration (TUC), a parameter- and architecture-free framework. We first reveal that terminal-layer confidence is jointly governed by class-centroid-level and node-level calibration. Theoretically, we show that reducing terminal-layer weight decay mitigates underconfidence, while a node-level distance constraint pulls test nodes closer to their predicted class centroids. TUC models class centroids directly from terminal-layer representations and jointly optimizes weight decay and node-to-centroid distances.
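The node-level component described above can be illustrated with a minimal NumPy sketch. The function names (`class_centers`, `node_to_center_penalty`) and the use of predicted-class means with a squared-distance penalty are illustrative assumptions for exposition, not the paper's actual implementation:

```python
import numpy as np

def class_centers(H, preds, num_classes):
    # Estimate each class centroid as the mean terminal-layer
    # representation of the nodes predicted into that class.
    # (Illustrative choice; the paper models centroids from the
    # terminal layer but may define them differently.)
    dim = H.shape[1]
    return np.stack([
        H[preds == c].mean(axis=0) if np.any(preds == c) else np.zeros(dim)
        for c in range(num_classes)
    ])

def node_to_center_penalty(H, preds, centers):
    # Mean squared distance from each node's terminal-layer
    # representation to its predicted class centroid; minimizing
    # this pulls test nodes toward their centroids.
    diffs = H - centers[preds]
    return float(np.mean(np.sum(diffs * diffs, axis=1)))
```

In a full training loop this penalty would be added to the task loss alongside a reduced weight-decay coefficient on the terminal-layer parameters, reflecting the two levers the framework jointly optimizes.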
Results: On multiple benchmark datasets, TUC significantly reduces Expected Calibration Error (ECE), achieving an average improvement of 28.6% over prior state-of-the-art methods, while introducing zero additional parameters and incurring negligible computational overhead.
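For reference, ECE is the standard calibration metric used above: predictions are grouped into confidence bins, and the gaps between per-bin average confidence and accuracy are averaged, weighted by bin size. A minimal NumPy implementation with equal-width bins (the binning scheme is the common default, assumed here):

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    # conf: per-sample predicted confidence in (0, 1]
    # correct: 1.0 if the prediction was right, else 0.0
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # |average confidence - accuracy| in this bin,
            # weighted by the fraction of samples falling in it
            ece += mask.mean() * abs(conf[mask].mean() - correct[mask].mean())
    return float(ece)
```

A well-calibrated under-confident model, the failure mode GNNs typically exhibit, shows bins where accuracy exceeds confidence; TUC's reported gains correspond to shrinking exactly these gaps.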
Abstract
Graph Neural Networks (GNNs) have demonstrated remarkable effectiveness on graph-based tasks. However, their predictive confidence is often miscalibrated, typically exhibiting under-confidence, which undermines the reliability of their decisions. Existing calibration methods for GNNs commonly introduce additional calibration components; these fail to capture the intrinsic relationship between the model and its prediction confidence, offering limited theoretical guarantees while increasing computational overhead. To address this issue, we propose a simple yet efficient graph calibration method. We establish a unified theoretical framework revealing that model confidence is jointly governed by class-centroid-level and node-level calibration at the final layer. Based on this insight, we theoretically show that reducing the weight decay on the final-layer parameters alleviates GNN under-confidence by acting at the class-centroid level, while node-level calibration serves as a finer-grained complement that encourages each test node to move closer to its predicted class centroid in the final-layer representation space. Extensive experiments validate the superiority of our method.