🤖 AI Summary
Function-space variational inference (VI) in Bayesian neural networks (BNNs) avoids the challenge of specifying weight priors but suffers from an ill-posed evidence lower bound (ELBO), which Burt et al. (2020) showed often evaluates to negative infinity. Method: We propose the first well-defined function-space VI objective based on a regularized KL divergence, rigorously supporting Gaussian process (GP) priors and resolving the ill-posedness of the ELBO. Our approach unifies generalized VI with function-space modeling, ensuring both theoretical soundness and computational tractability. Contribution/Results: Experiments on synthetic and small-scale real-world datasets demonstrate that our method faithfully recovers the behavior specified by the GP prior. In regression, classification, and out-of-distribution detection, it yields significantly better-calibrated uncertainty estimates than both weight-space VI and existing function-space VI baselines.
📝 Abstract
Bayesian neural networks (BNNs) promise to combine the predictive performance of neural networks with principled uncertainty modeling, which is important for safety-critical systems and decision making. However, posterior uncertainty estimates depend on the choice of prior, and finding informative priors in weight space has proven difficult. This has motivated variational inference (VI) methods that pose priors directly on the function generated by the BNN rather than on its weights. In this paper, we address a fundamental issue with such function-space VI approaches pointed out by Burt et al. (2020), who showed that the objective function (ELBO) is negative infinity for most priors of interest. Our solution builds on generalized VI (Knoblauch et al., 2019) with the regularized KL divergence (Quang, 2019) and is, to the best of our knowledge, the first well-defined variational objective for function-space inference in BNNs with Gaussian process (GP) priors. Experiments show that our method incorporates the properties specified by the GP prior on synthetic and small real-world data sets, and provides competitive uncertainty estimates for regression, classification, and out-of-distribution detection compared to BNN baselines with both function- and weight-space priors.
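To make the role of the regularization concrete, here is a minimal finite-dimensional sketch. It is not the paper's implementation: the function names, the value of `gamma`, and the choice to jitter both covariances are illustrative assumptions. It only shows the basic mechanism by which adding a `gamma * I` term to the covariances keeps a Gaussian KL divergence finite even when one covariance is singular, a finite-dimensional analogue of why a regularized KL divergence can stay well-defined where the standard function-space KL (and hence the ELBO) blows up.

```python
import numpy as np

def gaussian_kl(m1, S1, m2, S2):
    """KL(N(m1, S1) || N(m2, S2)) between k-dimensional Gaussians."""
    k = m1.shape[0]
    S2_inv = np.linalg.inv(S2)
    diff = m2 - m1
    _, logdet1 = np.linalg.slogdet(S1)  # log|S1|, robust for large k
    _, logdet2 = np.linalg.slogdet(S2)
    return 0.5 * (np.trace(S2_inv @ S1) + diff @ S2_inv @ diff - k
                  + logdet2 - logdet1)

def regularized_kl(m1, S1, m2, S2, gamma=0.1):
    # Illustrative regularization (an assumption, not the paper's exact
    # definition): inflate both covariances by gamma * I before taking
    # the KL. The jitter keeps the divergence finite even when S1 is
    # singular, where the plain KL would be infinite.
    I = np.eye(S1.shape[0])
    return gaussian_kl(m1, S1 + gamma * I, m2, S2 + gamma * I)
```

For example, with a rank-deficient posterior covariance `S1 = np.ones((3, 3))` and a standard-normal prior, `gaussian_kl` diverges (log-determinant of a singular matrix), while `regularized_kl` returns a finite, nonnegative value.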