🤖 AI Summary
This paper identifies and explains the "epistemic uncertainty collapse" phenomenon—where larger deep learning models exhibit degraded uncertainty quantification despite increased capacity—challenging the prevailing assumption that scale inherently improves uncertainty estimation.
Method: The authors propose implicit ensembling within large models as the cause of this collapse; they then extract implicit ensembles and decompose large vision models (e.g., ViT) into diverse sub-models to restore predictive diversity, thereby recovering calibrated epistemic uncertainty estimates. Their approach combines ViT interpretability analysis, theoretical modeling, and cross-architecture empirical validation (MLPs, ResNets, ViTs).
Contribution/Results: The collapse is reproduced consistently across architectures, from explicit ensembles of ensembles to state-of-the-art vision models; the proposed sub-model decomposition recovers epistemic uncertainty, with implications for out-of-distribution detection and safety-critical applications, advancing principled uncertainty-aware scaling of vision models.
📝 Abstract
Epistemic uncertainty is crucial for safety-critical applications and out-of-distribution detection tasks. Yet, we uncover a paradoxical phenomenon in deep learning models: an epistemic uncertainty collapse as model complexity increases, challenging the assumption that larger models invariably offer better uncertainty quantification. We propose that this stems from implicit ensembling within large models. To support this hypothesis, we demonstrate epistemic uncertainty collapse empirically across various architectures, from explicit ensembles of ensembles and simple MLPs to state-of-the-art vision models, including ResNets and Vision Transformers -- for the latter, we examine implicit ensemble extraction and decompose larger models into diverse sub-models, recovering epistemic uncertainty. We provide theoretical justification for these phenomena and explore their implications for uncertainty estimation.
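The collapse described above can be illustrated with a small numerical sketch (not from the paper; a hypothetical toy model). A standard way to quantify epistemic uncertainty in an ensemble is the mutual information between the prediction and the ensemble member: the entropy of the mean predictive distribution minus the mean entropy of the members. If each "member" is itself an average of several base predictors (an ensemble of ensembles, mimicking implicit ensembling inside a large model), the members converge toward the grand mean and this disagreement term shrinks:

```python
import numpy as np

rng = np.random.default_rng(0)

def epistemic_uncertainty(member_probs):
    """Mutual information I(y; member) = H(mean prediction) - E[H(member)].

    member_probs: array of shape (num_members, num_classes),
    each row a predictive distribution from one ensemble member.
    """
    mean_p = member_probs.mean(axis=0)
    total_entropy = -np.sum(mean_p * np.log(mean_p + 1e-12))
    expected_entropy = -np.mean(
        np.sum(member_probs * np.log(member_probs + 1e-12), axis=1)
    )
    return total_entropy - expected_entropy

# Simulate 32 diverse base predictors over 3 classes (softmax of random logits).
M, C = 32, 3
logits = rng.normal(size=(M, C))
base = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

def group_members(probs, k):
    """Form ensemble-of-ensemble members: each member averages k base models."""
    return probs.reshape(-1, k, probs.shape[1]).mean(axis=1)

# As k grows, members approach the grand mean and epistemic uncertainty collapses.
for k in (1, 4, 16):
    print(f"k={k:2d}  epistemic uncertainty = {epistemic_uncertainty(group_members(base, k)):.4f}")
```

Because entropy is concave, averaging base predictors into larger internal groups can only raise the expected member entropy while leaving the overall mean prediction unchanged, so the mutual information decreases monotonically with the internal group size k. This mirrors the paper's hypothesis that implicit ensembling inside large models suppresses the diversity needed for epistemic uncertainty, and why decomposing a large model back into diverse sub-models can recover it.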