🤖 AI Summary
To address insufficient uncertainty quantification in Bayesian neural networks (BNNs) and deep ensembles (DEs) for classification, this paper proposes the *credal wrapper*: a novel framework that first maps model-averaged outputs to convex sets of probability distributions (credal sets), explicitly modeling epistemic uncertainty via upper and lower probabilities, and then applies an intersection probability transform to yield a unique, robust prediction. Unlike existing approaches, it unifies uncertainty representation across BNNs and DEs without relying on posterior approximations or assumptions about ensemble diversity. Extensive evaluation on CIFAR and ImageNet benchmarks—using VGG, ResNet, and ViT architectures—demonstrates that the credal wrapper significantly outperforms BNN and DE baselines in out-of-distribution detection and robustness to input perturbations, achieving lower expected calibration error (ECE) and superior uncertainty discrimination.
📝 Abstract
This paper presents an innovative approach, called the credal wrapper, to formulating a credal set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles (DEs), capable of improving uncertainty estimation in classification tasks. Given a finite collection of single predictive distributions derived from BNNs or DEs, the proposed credal wrapper approach extracts an upper and a lower probability bound per class, acknowledging the epistemic uncertainty due to the availability of only a limited number of distributions. Such probability intervals over classes can be mapped onto a convex set of probabilities (a credal set), from which, in turn, a unique prediction can be obtained using a transformation called the intersection probability transformation. In this article, we conduct extensive experiments on several out-of-distribution (OOD) detection benchmarks, encompassing various dataset pairs (CIFAR10/100 vs SVHN/Tiny-ImageNet, CIFAR10 vs CIFAR10-C, CIFAR100 vs CIFAR100-C, and ImageNet vs ImageNet-O) and different network architectures (such as VGG16, ResNet-18/50, EfficientNet B2, and ViT Base). Compared to the BNN and DE baselines, the proposed credal wrapper method exhibits superior performance in uncertainty estimation and achieves a lower expected calibration error on corrupted data.
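The pipeline sketched in the abstract — per-class lower/upper bounds over a finite set of predictive distributions, followed by a transform to a single distribution — can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes the bounds are taken as element-wise minima and maxima over the sampled distributions, and it uses the standard intersection probability for probability intervals, $p_i = l_i + \alpha (u_i - l_i)$ with $\alpha$ chosen so the result sums to one; the function name `credal_wrapper_predict` is hypothetical.

```python
import numpy as np

def credal_wrapper_predict(dists):
    """Hypothetical sketch of the credal-wrapper idea.

    dists: (N, C) array of N predictive distributions (e.g. BNN
    posterior samples or DE members) over C classes.
    Returns (lower, upper, point): per-class probability bounds and
    a single distribution via the intersection probability transform.
    """
    dists = np.asarray(dists, dtype=float)
    # Assumed interval construction: element-wise bounds over members.
    lower = dists.min(axis=0)
    upper = dists.max(axis=0)
    # Intersection probability: p_i = l_i + alpha * (u_i - l_i),
    # with alpha set so that the result is a valid distribution.
    width = upper - lower
    denom = width.sum()
    alpha = (1.0 - lower.sum()) / denom if denom > 0 else 0.0
    point = lower + alpha * width
    return lower, upper, point

# Usage: three ensemble members over three classes.
members = [[0.7, 0.2, 0.1],
           [0.6, 0.3, 0.1],
           [0.8, 0.1, 0.1]]
lo, up, p = credal_wrapper_predict(members)
```

A wide gap between `lo` and `up` for a class signals high epistemic uncertainty (the members disagree), which is the quantity the paper exploits for OOD detection; `p` is the unique prediction used for classification and calibration.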