🤖 AI Summary
This work addresses the inefficiency of epistemic uncertainty estimation in regression ensembles. We propose Pairwise Distance Estimators (PaiDEs), an efficient uncertainty quantification method that establishes a theoretical connection between pairwise distances among ensemble components and bounds on predictive entropy. By bounding uncertainty directly through distance metrics, PaiDEs eliminate the need for computationally expensive Monte Carlo sampling while preserving theoretical rigor, achieving up to 100× faster inference and covering far more inputs at once. Empirically, the estimators strengthen BALD-based active learning: on benchmarks ranging from a 1D sinusoid to the high-dimensional dynamical systems Pendulum, Hopper, Ant, and Humanoid, the approach outperforms existing active learning methods on high-dimensional regression tasks. Overall, PaiDEs provide a scalable, theoretically grounded route to uncertainty quantification for regression ensembles.
📝 Abstract
This work introduces a novel, efficient approach to epistemic uncertainty estimation for ensemble regression models using pairwise-distance estimators (PaiDEs). Utilizing the pairwise distances between model components, these estimators establish bounds on entropy. We leverage this capability to enhance the performance of Bayesian Active Learning by Disagreement (BALD). Notably, unlike sample-based Monte Carlo estimators, PaiDEs can estimate epistemic uncertainty up to 100 times faster while covering a significantly larger number of inputs at once, and they perform better in higher dimensions. To validate our approach, we conducted a varied series of regression experiments on commonly used benchmarks: 1D sinusoidal data, $\textit{Pendulum}$, $\textit{Hopper}$, $\textit{Ant}$ and $\textit{Humanoid}$. For each experimental setting, an active learning framework was applied to demonstrate the advantages of PaiDEs for epistemic uncertainty estimation. We compare our approach to existing active learning methods and find that it outperforms them on high-dimensional regression tasks.
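To make the core idea concrete, below is a minimal sketch of a pairwise-distance entropy estimator for a uniformly weighted ensemble of Gaussian predictive components. It follows the general pairwise-distance form $\hat{H} = \sum_i w_i H(p_i) - \sum_i w_i \log \sum_j w_j e^{-D(p_i \| p_j)}$ (known from Kolchinsky and Tracey's work on mixture-entropy estimation, which with KL divergence as the distance yields an upper bound); the function names and the restriction to 1D Gaussians are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def kl_gauss(mu_i, var_i, mu_j, var_j):
    # Closed-form KL divergence KL(N(mu_i, var_i) || N(mu_j, var_j)) for 1D Gaussians.
    return 0.5 * (np.log(var_j / var_i) + (var_i + (mu_i - mu_j) ** 2) / var_j - 1.0)

def paide_entropy_bound(mus, variances, weights=None):
    # Pairwise-distance entropy estimate for a Gaussian mixture:
    #   sum_i w_i H(p_i) - sum_i w_i log( sum_j w_j exp(-D(p_i || p_j)) )
    # With D = KL this gives an upper bound on the mixture entropy;
    # no Monte Carlo sampling is required.
    mus = np.asarray(mus, dtype=float)
    variances = np.asarray(variances, dtype=float)
    K = len(mus)
    w = np.full(K, 1.0 / K) if weights is None else np.asarray(weights, dtype=float)

    # Differential entropy of each Gaussian component.
    comp_entropy = 0.5 * np.log(2.0 * np.pi * np.e * variances)

    # K x K matrix of pairwise KL divergences between components.
    D = np.array([[kl_gauss(mus[i], variances[i], mus[j], variances[j])
                   for j in range(K)] for i in range(K)])

    # Cross term: -log sum_j w_j exp(-D_ij), evaluated for each component i.
    cross = -np.log(np.exp(-D) @ w)
    return float(w @ comp_entropy + w @ cross)
```

For a single component (or identical components) the pairwise distances vanish and the estimate reduces exactly to the Gaussian entropy; as components separate, the cross term grows toward $\log K$, reflecting the disagreement-driven entropy increase that BALD exploits.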