Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian Neural Networks

📅 2024-06-06
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Function-space variational inference (VI) in Bayesian neural networks (BNNs) avoids the difficulty of specifying weight-space priors but suffers from an ill-posed evidence lower bound (ELBO), which Burt et al. (2020) showed evaluates to negative infinity for most priors of interest. Method: We propose the first well-defined function-space VI objective for BNNs, based on a regularized KL divergence, which rigorously admits Gaussian process (GP) priors and resolves the ELBO's ill-posedness. Our approach combines generalized VI with function-space modeling, targeting both theoretical soundness and computational tractability. Contribution/Results: Experiments on synthetic and small-scale real-world datasets show that the method faithfully recovers the behavior specified by the GP prior. On regression, classification, and out-of-distribution detection, it provides well-calibrated uncertainty estimates, competitive with both weight-space VI and existing function-space VI baselines.

📝 Abstract
Bayesian neural networks (BNN) promise to combine the predictive performance of neural networks with principled uncertainty modeling important for safety-critical systems and decision making. However, posterior uncertainty estimates depend on the choice of prior, and finding informative priors in weight-space has proven difficult. This has motivated variational inference (VI) methods that pose priors directly on the function generated by the BNN rather than on weights. In this paper, we address a fundamental issue with such function-space VI approaches pointed out by Burt et al. (2020), who showed that the objective function (ELBO) is negative infinite for most priors of interest. Our solution builds on generalized VI (Knoblauch et al., 2019) with the regularized KL divergence (Quang, 2019) and is, to the best of our knowledge, the first well-defined variational objective for function-space inference in BNNs with Gaussian process (GP) priors. Experiments show that our method incorporates the properties specified by the GP prior on synthetic and small real-world data sets, and provides competitive uncertainty estimates for regression, classification and out-of-distribution detection compared to BNN baselines with both function and weight-space priors.
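The failure mode the abstract describes, and the flavor of the fix, can be illustrated in finite dimensions: the standard KL divergence between two Gaussians is infinite when the first covariance is degenerate relative to the second, while adding a small regularizer γI to both covariances before comparing them keeps the divergence finite. The sketch below is only a finite-dimensional analogue in the spirit of Quang's regularized KL divergence, not the paper's function-space construction; the function names and the choice of γ are illustrative assumptions.

```python
import numpy as np

def kl_gaussian(m0, K0, m1, K1):
    """Standard KL( N(m0, K0) || N(m1, K1) ) between d-dimensional Gaussians.
    Diverges to +inf when K0 is singular (slogdet of K0 is -inf)."""
    d = len(m0)
    K1_inv = np.linalg.inv(K1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(K0)
    _, logdet1 = np.linalg.slogdet(K1)
    return 0.5 * (np.trace(K1_inv @ K0) + diff @ K1_inv @ diff
                  - d + logdet1 - logdet0)

def regularized_kl(m0, K0, m1, K1, gamma=1e-3):
    """Finite-dimensional analogue of a regularized KL divergence:
    add gamma * I to both covariances before taking the standard
    Gaussian KL, so the result stays finite even when the two
    Gaussians have mismatched (or degenerate) support."""
    I = np.eye(len(m0))
    return kl_gaussian(m0, K0 + gamma * I, m1, K1 + gamma * I)
```

For example, with a rank-deficient `K0 = diag(1, 0)` and `K1 = I`, `kl_gaussian` returns infinity while `regularized_kl` returns a finite value; this mirrors, in miniature, how regularizing the divergence restores a well-defined variational objective.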
Problem

Research questions and friction points this paper is trying to address.

Addresses the negative-infinite ELBO in function-space VI for BNNs
Proposes regularized KL divergence for well-defined variational objectives
Enables Gaussian process priors in function-space BNN inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularized KL divergence for a well-defined, stable variational objective
Priors specified in function space instead of hard-to-specify weight space
Gaussian process priors for Bayesian neural networks