🤖 AI Summary
This paper addresses pervasive overfitting and poor calibration in deep neural networks for image classification, out-of-distribution (OOD) detection, and transfer learning. The authors propose a temperature-controlled deep ensemble method: first, they employ stochastic gradient Hamiltonian Monte Carlo (SGHMC) as the proposal mechanism within sequential Monte Carlo (SMC), enabling efficient Bayesian weight sampling in mini-batch settings; second, they introduce a temperature-annealing strategy that explicitly controls posterior contraction, mitigating overfitting and improving uncertainty estimation. The method requires no architectural modifications and enables plug-and-play Bayesianization of pre-trained models. Experiments across multiple benchmarks show consistent gains over SGD and standard deep ensembles: lower calibration error, higher OOD-detection AUC, and better transfer generalization, with superior uncertainty quantification and robustness.
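To make the SMC-with-temperature idea concrete, here is a minimal, hedged sketch of tempered SMC on a toy conjugate model: particles are reweighted by the likelihood raised to each temperature increment, resampled, then mutated by a stochastic-gradient move. The paper's mutation kernel is SGHMC; for brevity this sketch substitutes a few unadjusted Langevin steps, and all model details (prior, observation, step sizes) are illustrative assumptions, not values from the paper.

```python
import numpy as np

def tempered_smc(log_like, grad_logpost, particles, betas, n_move=10,
                 step=0.1, rng=None):
    """Sketch of SMC over a temperature ladder 0 = beta_0 < ... < beta_K = 1.

    Each stage reweights particles by the likelihood raised to the temperature
    increment, resamples, then mutates them with a gradient-based kernel
    (unadjusted Langevin here as a stand-in for the paper's SGHMC proposals).
    """
    rng = np.random.default_rng(rng)
    n = len(particles)
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Reweight by the incremental tempered likelihood p(D|theta)^(db).
        logw = (b_next - b_prev) * log_like(particles)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Multinomial resampling.
        particles = particles[rng.choice(n, size=n, p=w)]
        # Mutation: Langevin steps targeting the tempered posterior at b_next.
        for _ in range(n_move):
            particles = (particles
                         + 0.5 * step * grad_logpost(particles, b_next)
                         + np.sqrt(step) * rng.standard_normal(n))
    return particles

# Toy conjugate model (illustrative): prior N(0, 3^2), one observation y = 2
# with unit noise, so the analytic posterior is N(1.8, 0.9).
y = 2.0
log_like = lambda th: -0.5 * (y - th) ** 2
grad_logpost = lambda th, b: -th / 9.0 + b * (y - th)

init = 3.0 * np.random.default_rng(0).standard_normal(2000)  # prior draws
out = tempered_smc(log_like, grad_logpost, init, np.linspace(0, 1, 11), rng=1)
```

On this toy problem the final particle cloud approximates the analytic posterior; in the paper's setting the particles are neural-network weight vectors and the likelihood is the mini-batch training loss.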
📝 Abstract
Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant that incorporates Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) proposals into SMC, enabling efficient mini-batch sampling. The resulting SMCSGHMC algorithm outperforms standard stochastic gradient descent (SGD) and deep ensembles across image classification, out-of-distribution (OOD) detection, and transfer learning tasks. We further show that SMCSGHMC mitigates overfitting and improves calibration, providing a flexible, scalable path for converting pretrained neural networks into well-calibrated Bayesian models.
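The SGHMC proposal itself can be sketched in a few lines. Below is a hedged, minimal implementation of the standard SGHMC update (momentum plus friction plus injected noise) with a posterior temperature knob, demonstrated on a toy 1D standard-normal target with artificial mini-batch gradient noise. The function name, step sizes, and the noisy-gradient toy are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def sghmc_sample(grad_logpost, theta0, n_steps=5000, lr=1e-2,
                 friction=0.1, temperature=1.0, rng=None):
    """Sketch of SGHMC with a posterior temperature T.

    grad_logpost(theta) returns a (possibly mini-batch-noisy) gradient of the
    log posterior. The injected noise is scaled by T, so T < 1 contracts the
    sampled distribution and T > 1 flattens it (temperature control of
    posterior contraction, as in the paper's annealing strategy).
    """
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    samples = []
    noise_scale = np.sqrt(2.0 * friction * lr * temperature)
    for _ in range(n_steps):
        # Momentum update: gradient force, friction, and injected noise.
        v = (v + lr * grad_logpost(theta) - friction * v
             + noise_scale * rng.standard_normal(theta.shape))
        theta = theta + v
        samples.append(theta.copy())
    return np.array(samples)

# Toy target: standard normal posterior, with additive noise mimicking
# mini-batch gradient estimates.
_grad_rng = np.random.default_rng(0)
def noisy_grad(theta):
    return -theta + 0.1 * _grad_rng.standard_normal(theta.shape)

samples = sghmc_sample(noisy_grad, theta0=np.array([3.0]), rng=1)
burn = samples[2500:]  # discard burn-in before summarizing
```

After burn-in, the chain's mean and spread should roughly match the N(0, 1) target; lowering `temperature` would shrink the spread, which is the mechanism the paper uses to control posterior contraction.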