Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient Ensembles

📅 2026-01-23
📈 Citations: 1
Influential: 0
🤖 AI Summary
This study addresses the critical challenge of efficiently obtaining reliable model uncertainty under resource-constrained and low-latency conditions. It presents a systematic evaluation of BatchEnsemble in terms of accuracy, calibration, and out-of-distribution (OOD) detection performance. Through comprehensive empirical analyses—including comparisons with deep ensembles, calibration assessments, and controlled investigations of functional and parameter-space similarity among ensemble members on MNIST—the work reveals, for the first time, that BatchEnsemble members exhibit high homogeneity and lack diversity. Results show that BatchEnsemble performs comparably to a single model on CIFAR-10, CIFAR-10-C, and SVHN, while its members on MNIST are nearly identical, failing to capture the predictive diversity characteristic of true ensembles. These findings cast doubt on BatchEnsemble’s effectiveness as an efficient ensemble method.
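The homogeneity finding above rests on comparing members' predictions in function space. A minimal sketch of one such functional-similarity measure, pairwise prediction disagreement, is shown below; the function name and toy data are illustrative, not taken from the paper.

```python
import numpy as np

def pairwise_disagreement(member_logits):
    """Fraction of inputs on which two members' argmax predictions differ,
    averaged over all member pairs. Values near zero indicate that the
    ensemble members are functionally near-identical."""
    preds = member_logits.argmax(axis=-1)   # (n_members, n_inputs)
    n = preds.shape[0]
    rates = [np.mean(preds[i] != preds[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(rates))

# Toy comparison: near-identical members vs. genuinely diverse ones.
rng = np.random.default_rng(1)
base = rng.standard_normal((100, 10))
homogeneous = np.stack([base, base + 0.01 * rng.standard_normal(base.shape)])
diverse = np.stack([base, rng.standard_normal(base.shape)])

low = pairwise_disagreement(homogeneous)   # typically near 0
high = pairwise_disagreement(diverse)      # typically large
```

Under this kind of metric, a "true ensemble" shows substantial disagreement on hard or out-of-distribution inputs, while the paper's finding is that BatchEnsemble members on MNIST sit at the homogeneous end of the scale.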

📝 Abstract
In resource-constrained and low-latency settings, uncertainty estimates must be obtained efficiently. Deep Ensembles provide robust epistemic uncertainty (EU) but require training multiple full-size models. BatchEnsemble aims to deliver ensemble-like EU at far lower parameter and memory cost by applying learned rank-1 perturbations to a shared base network. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single-model baseline in terms of accuracy, calibration, and out-of-distribution (OOD) detection on CIFAR10/10C/SVHN. A controlled study on MNIST finds members are near-identical in function and parameter space, indicating limited capacity to realize distinct predictive modes. Thus, BatchEnsemble behaves more like a single model than a true ensemble.
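The rank-1 construction the abstract refers to can be sketched as follows: each member i owns vectors r_i and s_i, and its effective weight is the shared matrix W perturbed elementwise by the outer product r_i s_iᵀ. This NumPy sketch is illustrative (shapes and member count are arbitrary), not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared base weight matrix for one layer, plus per-member rank-1 factors.
d_in, d_out, n_members = 4, 3, 2
W = rng.standard_normal((d_in, d_out))
R = rng.standard_normal((n_members, d_in))    # r_i vectors
S = rng.standard_normal((n_members, d_out))   # s_i vectors

def member_forward(x, i):
    """Forward pass of member i: y = ((x * r_i) @ W) * s_i.
    Algebraically equal to x @ (W ∘ r_i s_i^T), so the members share W
    and differ only in the cheap rank-1 factors."""
    return ((x * R[i]) @ W) * S[i]

x = rng.standard_normal((5, d_in))            # batch of 5 inputs
outputs = np.stack([member_forward(x, i) for i in range(n_members)])

# Sanity check against the explicit Hadamard-perturbed weight.
W0 = W * np.outer(R[0], S[0])
assert np.allclose(outputs[0], x @ W0)
```

The efficiency argument is visible in the parameter count: each extra member costs only d_in + d_out scalars per layer rather than d_in × d_out, which is exactly why the paper asks whether such tightly shared members can be diverse at all.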
Problem

Research questions and friction points this paper is trying to address.

BatchEnsemble
Deep Ensembles
epistemic uncertainty
model calibration
out-of-distribution detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

BatchEnsemble
Deep Ensembles
epistemic uncertainty
model diversity
calibration
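Since calibration is one of the paper's three evaluation axes, a minimal sketch of the standard expected calibration error (ECE), the usual metric for such assessments, may help; the binning scheme and names here are the common textbook version, not necessarily the paper's exact protocol.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard ECE: bin predictions by confidence and average the
    |accuracy - confidence| gap per bin, weighted by bin size.
    A well-calibrated model has ECE near zero."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()          # empirical accuracy in bin
            conf = confidences[mask].mean()     # mean confidence in bin
            ece += (mask.sum() / n) * abs(acc - conf)
    return float(ece)

# Toy case: predictions at 80% confidence that are right 80% of the time
# are perfectly calibrated, so the ECE is (near) zero.
conf = np.full(10, 0.8)
corr = np.array([1] * 8 + [0] * 2, dtype=float)
ece = expected_calibration_error(conf, corr)
```

Comparing this number across a single model, BatchEnsemble, and a Deep Ensemble on CIFAR-10/10-C is the kind of calibration evidence the summary above draws on.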