Towards E-Value Based Stopping Rules for Bayesian Deep Ensembles

📅 2026-04-20

🤖 AI Summary

This study addresses the challenge of determining when to terminate Markov chain Monte Carlo (MCMC) sampling in Bayesian deep ensembles, aiming to preserve performance gains while avoiding redundant computation. To this end, the authors propose an anytime-valid sequential hypothesis testing framework based on e-values, which models the multi-chain MCMC sampling process as a continual comparison against an initial deep ensemble baseline. Early stopping is triggered as soon as the e-value provides statistically significant evidence that the sampled ensemble outperforms the baseline. This work introduces e-value theory into Bayesian deep ensembles for the first time, offering a statistically rigorous and computationally efficient stopping mechanism. Experimental results demonstrate that the method achieves performance comparable to full-budget sampling using only a small fraction of the computational resources, substantially improving efficiency.

📝 Abstract

Bayesian Deep Ensembles (BDEs) represent a powerful approach for uncertainty quantification in deep learning, combining the robustness of Deep Ensembles (DEs) with flexible multi-chain MCMC. While DEs are affordable in most deep learning settings, (long) sampling of Bayesian neural networks can be prohibitively costly. Yet, adding sampling after optimizing the DEs has been shown to yield significant improvements. This leaves a critical practical question: How long should the sequential sampling process continue to yield significant improvements over the initial optimized DE baseline? To tackle this question, we propose a stopping rule based on E-values. We formulate the ensemble construction as a sequential anytime-valid hypothesis test, providing a principled way to decide whether or not to reject the null hypothesis that MCMC offers no improvement over a strong baseline, and hence when to stop sampling early. Empirically, we study this approach in diverse settings. Our results demonstrate the efficacy of our approach and reveal that often only a fraction of the full-chain budget is required.
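
To make the idea of an anytime-valid stopping rule concrete, the following is a minimal sketch of a generic betting-style e-process. It is not the construction used in the paper, which this page does not specify. It assumes a stream of bounded scores `d_t` in [-1, 1], for example a rescaled per-step difference between the baseline loss and the loss of the sampled ensemble, and a null hypothesis that the conditional mean of `d_t` is at most zero (no improvement). The function name `e_process_stop`, the fixed bet size `lam`, and the score definition are all illustrative assumptions. Under that null the running product is a nonnegative supermartingale starting at 1, so by Ville's inequality the chance that it ever reaches `1 / alpha` is at most `alpha`, no matter when the check is made.

```python
def e_process_stop(diffs, alpha=0.05, lam=0.5):
    """Run a fixed-bet e-process over a stream of scores in [-1, 1].

    Hypothetical helper for illustration only.
    Null hypothesis: each score has conditional mean <= 0.
    Returns (stop_step, e_value); stop_step is None if the threshold
    1 / alpha is never reached within the supplied stream.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1) so every factor stays positive")

    e_value = 1.0
    threshold = 1.0 / alpha
    for step, d in enumerate(diffs, start=1):
        if not -1.0 <= d <= 1.0:
            raise ValueError("scores must be rescaled to [-1, 1]")
        # Each factor has expectation <= 1 under the null, so the
        # running product is a nonnegative supermartingale.
        e_value *= 1.0 + lam * d
        if e_value >= threshold:
            return step, e_value  # enough evidence: reject the null and stop
    return None, e_value


if __name__ == "__main__":
    # Toy stream with a constant improvement of 0.5 per step (made-up data).
    step, e_val = e_process_stop([0.5] * 100)
    print(f"stopped at step {step} with e-value {e_val:.2f}")
```

A fixed `lam` keeps the sketch short. In practice the bet size is often adapted from past observations only, which preserves the supermartingale property.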

Problem

Research questions and friction points this paper is trying to address.

Bayesian Deep Ensembles

stopping rule

MCMC sampling

uncertainty quantification

E-value

Innovation

Methods, ideas, or system contributions that make the work stand out.

E-values

Bayesian Deep Ensembles

stopping rule