🤖 AI Summary
To address the low efficiency and poor compatibility of epistemic uncertainty quantification for deep learning models in limited-data settings, this paper proposes Delta Variances, a family of methods that estimates epistemic uncertainty for neural networks and their composite functions at the cost of a single gradient computation. It presents a unified theoretical perspective that subsumes several major approaches, including Bayesian approximation, gradient-based variance estimation, and sensitivity analysis, within a single algorithmic family, and this perspective naturally gives rise to a novel variant with improved empirical performance. Grounded in gradient sensitivity and variance propagation, Delta Variances require no changes to the network architecture or training procedure, so they apply to arbitrary neural networks and to simulation systems that incorporate neural modules. On a complex weather-simulation task, they obtain competitive uncertainty estimates at a fraction of the computational cost of standard baselines, which significantly improves practical applicability.
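The gradient-sensitivity idea behind such methods can be illustrated with the classical delta method: the epistemic variance of a prediction is approximated as g(x)ᵀ Σ g(x), where g(x) is the gradient of the prediction with respect to the parameters and Σ approximates the parameter posterior covariance. The sketch below is an assumption-level illustration, not the paper's exact algorithm; the toy linear model, the diagonal empirical-Fisher-style surrogate for Σ, and all names (`predict`, `delta_variance`, `sigma_diag`) are illustrative choices.

```python
import numpy as np

# Hedged sketch of a delta-method variance estimate (not the paper's exact
# algorithm): Var[f_theta(x)] ~= g(x)^T Sigma g(x), with g(x) = d f / d theta.

rng = np.random.default_rng(0)

# Toy linear model f_theta(x) = theta . x with illustrative fitted parameters.
theta = np.array([0.5, -1.2, 0.8])

def predict(theta, x):
    return theta @ x

def grad_predict(theta, x):
    # For a linear model, d f / d theta = x; for a neural network this would
    # be one backward pass of automatic differentiation.
    return x

# Approximate Sigma with a diagonal built from per-example loss-gradient
# second moments (an empirical-Fisher-style surrogate; an assumption here).
X = rng.normal(size=(200, 3))
y = X @ theta + 0.1 * rng.normal(size=200)
per_example_grads = 2 * (X @ theta - y)[:, None] * X  # d/d theta of squared error
fisher_diag = np.mean(per_example_grads ** 2, axis=0)
sigma_diag = 1.0 / (fisher_diag * len(X) + 1e-6)      # crude inverse-Fisher scale

def delta_variance(x):
    # g^T diag(sigma) g: one gradient evaluation per query point.
    g = grad_predict(theta, x)
    return float(np.sum(g ** 2 * sigma_diag))

print(delta_variance(np.array([1.0, 0.0, 0.0])))
```

Because the model here is linear, the estimate scales quadratically with the input, i.e. doubling x quadruples the estimated variance; with a real network the gradient g(x) would instead come from autodiff, still at the cost of a single backward pass.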
📝 Abstract
Decision makers may suffer from uncertainty induced by limited data. This may be mitigated by accounting for epistemic uncertainty, which is, however, challenging to estimate efficiently for large neural networks. To this end we investigate Delta Variances, a family of algorithms for epistemic uncertainty quantification that is computationally efficient and convenient to implement. It can be applied to neural networks and to more general functions composed of neural networks. As an example we consider a weather simulator with a neural-network-based step function inside -- here Delta Variances empirically obtain competitive results at the cost of a single gradient computation. The approach is convenient as it requires no changes to the neural network architecture or training procedure. We discuss multiple ways to derive Delta Variances theoretically, noting that special cases recover popular techniques, and present a unified perspective on multiple related methods. Finally, we observe that this general perspective gives rise to a natural extension and empirically show its benefit.
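The composite-function case mentioned above (a simulator whose step function is learned) can be sketched the same way: differentiate the *whole* rollout with respect to the learned parameters and propagate an assumed parameter variance through that gradient. Everything below is a hedged toy, not the paper's weather simulator; the scalar step rule, the value of `sigma2`, and the finite-difference gradient (standing in for one backward pass of autodiff) are all assumptions.

```python
import numpy as np

# Hedged sketch: delta-method uncertainty for a composite function. A toy
# "simulator" applies a learned scalar step s <- s + theta * s for T steps;
# the epistemic variance of the final state is g^2 * sigma2 with
# g = d s_T / d theta, the gradient through the entire rollout.

sigma2 = 1e-4   # assumed posterior variance of the single learned parameter
theta = 0.05    # illustrative fitted step parameter
T = 10

def simulate(theta, s0=1.0):
    s = s0
    for _ in range(T):
        s = s + theta * s   # a neural step function would sit here
    return s

def grad_theta(theta, s0=1.0, eps=1e-6):
    # Central difference stands in for the single backward pass of autodiff.
    return (simulate(theta + eps, s0) - simulate(theta - eps, s0)) / (2 * eps)

g = grad_theta(theta)
delta_var = g ** 2 * sigma2   # variance propagated through the whole rollout
print(delta_var)
```

For this toy rollout the gradient has the closed form s0 · T · (1 + theta)^(T-1), which the finite difference recovers; the point is that the same single-gradient recipe applies unchanged when a neural network, rather than a scalar rule, defines the step.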