Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles

📅 2025-05-16
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses pervasive overfitting and poor calibration in deep neural networks across image classification, out-of-distribution (OOD) detection, and transfer learning. The authors propose a temperature-controlled deep ensemble method: first, stochastic gradient Hamiltonian Monte Carlo (SGHMC) serves as the proposal mechanism within sequential Monte Carlo (SMC), enabling efficient Bayesian weight sampling in mini-batch settings; second, a temperature annealing strategy explicitly controls posterior contraction, mitigating overfitting and improving uncertainty estimation. The method requires no architectural modifications and enables plug-and-play Bayesianization of pretrained models. Experiments show significant improvements over SGD and standard deep ensembles in calibration error, OOD detection AUC, and transfer generalization across multiple benchmarks, yielding superior uncertainty quantification and robustness.
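The two ingredients above (SGHMC proposals inside an annealed SMC loop) can be sketched on a toy problem. This is a minimal illustration of the general technique, not the paper's implementation: the toy Gaussian-mean model, particle count, step size, friction constant, and linear temperature ladder are all illustrative assumptions.

```python
import math
import random

random.seed(0)

# Toy problem: infer the mean of N(theta, 1) data under a N(0, 10) prior.
data = [2.0 + random.gauss(0, 1) for _ in range(200)]

def log_lik(theta):
    # Full-data Gaussian log-likelihood (up to an additive constant).
    return -0.5 * sum((y - theta) ** 2 for y in data)

def grad_log_post(theta, batch, beta):
    # Mini-batch stochastic gradient of the tempered log-posterior
    # beta * log p(data | theta) + log p(theta).
    grad_lik = len(data) / len(batch) * sum(y - theta for y in batch)
    return beta * grad_lik - theta / 10.0

def sghmc_move(theta, beta, steps=20, eps=1e-4, friction=0.1):
    # SGHMC rejuvenation: friction-damped Hamiltonian dynamics driven by
    # noisy mini-batch gradients plus injected Gaussian noise.
    v = 0.0
    for _ in range(steps):
        batch = random.sample(data, 20)
        v += (eps * grad_log_post(theta, batch, beta)
              - friction * v
              + math.sqrt(2 * friction * eps) * random.gauss(0, 1))
        theta += v
    return theta

# SMC over an annealed inverse-temperature ladder 0 -> 1.
n_particles = 100
particles = [random.gauss(0, math.sqrt(10)) for _ in range(n_particles)]
betas = [i / 10 for i in range(11)]
for b_prev, b_next in zip(betas, betas[1:]):
    # Reweight each particle by the likelihood raised to the increment.
    logw = [(b_next - b_prev) * log_lik(t) for t in particles]
    m = max(logw)
    weights = [math.exp(lw - m) for lw in logw]
    # Multinomial resampling, then one SGHMC move per particle.
    particles = random.choices(particles, weights=weights, k=n_particles)
    particles = [sghmc_move(t, b_next) for t in particles]

post_mean = sum(particles) / n_particles
```

Annealing from beta = 0 (prior) to beta = 1 (full posterior) is what gives explicit control over posterior contraction: stopping the ladder short of 1, or reshaping it, tempers how sharply the particle cloud concentrates.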

📝 Abstract
Sequential Monte Carlo (SMC) methods offer a principled approach to Bayesian uncertainty quantification but are traditionally limited by the need for full-batch gradient evaluations. We introduce a scalable variant by incorporating Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) proposals into SMC, enabling efficient mini-batch based sampling. Our resulting SMCSGHMC algorithm outperforms standard stochastic gradient descent (SGD) and deep ensembles across image classification, out-of-distribution (OOD) detection, and transfer learning tasks. We further show that SMCSGHMC mitigates overfitting and improves calibration, providing a flexible, scalable pathway for converting pretrained neural networks into well-calibrated Bayesian models.
Problem

Research questions and friction points this paper is trying to address.

Mitigating overfitting in neural networks
Improving model calibration and uncertainty quantification
Enabling scalable Bayesian conversion of pretrained models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses SGHMC proposals for scalable SMC
Enables efficient mini-batch based sampling
Converts pretrained networks into Bayesian models
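The plug-and-play conversion of a pretrained network can be sketched as seeding the SMC particle cloud at the pretrained solution. The function name `bayesianize` and the jitter scale below are illustrative assumptions, not the paper's API.

```python
import random

def bayesianize(pretrained_weights, n_particles=10, jitter=0.01):
    """Seed an SMC particle cloud at a pretrained solution.

    Each particle is a jittered copy of the pretrained weight vector, so
    subsequent reweight/resample/SGHMC steps explore the posterior around
    the pretrained optimum rather than from a random initialization.
    """
    return [
        [w + random.gauss(0, jitter) for w in pretrained_weights]
        for _ in range(n_particles)
    ]

random.seed(0)
# Hypothetical 3-weight "network" standing in for a real checkpoint.
ensemble = bayesianize([0.5, -1.2, 3.0])
```

Starting all particles near the pretrained weights is what makes the conversion plug-and-play: no retraining from scratch and no architectural change is needed before the SMC pass begins.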