🤖 AI Summary
This work addresses the common trade-off in deep heteroscedastic regression between uncertainty quantification and mean prediction, which often leads to optimization difficulties, representation collapse, and variance overfitting. To circumvent these issues, the authors propose a post-hoc variance modeling approach that fits a heteroscedastic model on an intermediate layer of a pre-trained network using a held-out calibration dataset, without altering the original architecture or requiring end-to-end retraining. This method effectively mitigates the limitations of conventional training strategies, achieving uncertainty quantification performance on par with or superior to state-of-the-art methods across multiple molecular graph datasets, while preserving high predictive accuracy for the mean and incurring minimal additional inference overhead.
📝 Abstract
Uncertainty quantification (UQ) in deep learning regression is of wide interest, as it supports critical applications including sequential decision making and risk-sensitive tasks. In heteroskedastic regression, where the uncertainty of the target depends on the input, a common approach is to train a neural network that parameterizes the mean and the variance of the predictive distribution. Still, training deep heteroskedastic regression models poses practical challenges stemming from the trade-off between uncertainty quantification and mean prediction, including optimization difficulties, representation collapse, and variance overfitting. In this work we identify previously undiscussed fallacies and propose a simple and efficient procedure that addresses these challenges jointly by post-hoc fitting a variance model across the intermediate layers of a pretrained network on a hold-out dataset. We demonstrate that our method achieves uncertainty quantification on par with or exceeding the state of the art on several molecular graph datasets, without compromising mean prediction accuracy, while remaining cheap to use at prediction time.
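The post-hoc idea described above can be sketched as follows: freeze a pretrained mean-prediction network, expose an intermediate representation, and fit only a small variance head on held-out data by minimizing the Gaussian negative log-likelihood with the mean held fixed. This is a minimal illustrative sketch, not the paper's actual implementation; the module names (`backbone`, `mean_head`, `log_var_head`) and the toy data are assumptions for demonstration.

```python
# Hedged sketch: post-hoc variance fitting on a frozen pretrained regressor.
# All names and shapes here are illustrative, not taken from the paper's code.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a pretrained network: a feature extractor giving an
# intermediate representation h(x), and the original mean head mu(h).
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())  # "pretrained" features
mean_head = nn.Linear(16, 1)                           # "pretrained" mean head
for p in list(backbone.parameters()) + list(mean_head.parameters()):
    p.requires_grad_(False)                            # leave the original model intact

# Only this small variance head is trained post hoc.
log_var_head = nn.Linear(16, 1)
opt = torch.optim.Adam(log_var_head.parameters(), lr=1e-2)

# Toy stand-in for a held-out calibration split.
x_cal = torch.randn(256, 8)
y_cal = torch.randn(256, 1)

for _ in range(200):
    with torch.no_grad():
        h = backbone(x_cal)   # frozen intermediate features
        mu = mean_head(h)     # frozen mean prediction
    log_var = log_var_head(h)
    # Gaussian negative log-likelihood, mean held fixed:
    nll = 0.5 * (log_var + (y_cal - mu) ** 2 / log_var.exp()).mean()
    opt.zero_grad()
    nll.backward()
    opt.step()
```

Because the backbone and mean head are frozen, the original architecture and its mean predictions are untouched, and inference adds only one extra linear layer on features the network already computes.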