🤖 AI Summary
Standard Gaussian process (GP) approximations cannot capture heavy-tailed uncertainty in Bayesian neural networks (BNNs) in the infinite-width limit. Method: The authors rigorously analyze the asymptotic behavior of BNN posteriors under Gaussian priors on the weights, with Inverse-Gamma priors on the variance of the last hidden layer and on the variance of the Gaussian likelihood. Contribution/Results: They establish that the BNN posterior converges to a Student-t process (TP), not a GP, in the infinite-width limit, and they quantify this convergence in the Wasserstein metric with an explicit rate. The result formally establishes asymptotic equivalence between BNN posteriors and TPs, providing a theoretically grounded and flexible framework for uncertainty quantification in deep Bayesian models; the TP's heavier tails can also improve robustness to outliers compared with GP-based alternatives.
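The mechanism behind the Student-t limit is the classical Gaussian scale-mixture identity. The statement below is a standard fact, written here for a generic finite-dimensional marginal with kernel matrix $K$ (our notation, not necessarily the paper's):

$$
f \mid \sigma^2 \sim \mathcal{N}\!\left(0,\, \sigma^2 K\right), \qquad \sigma^2 \sim \mathrm{Inv\text{-}Gamma}(a, b) \;\;\Longrightarrow\;\; f \sim \mathrm{St}\!\left(\nu = 2a,\; 0,\; \tfrac{b}{a} K\right).
$$

Integrating the Gaussian over the Inverse-Gamma variance yields a multivariate Student-t with $2a$ degrees of freedom and scale matrix $(b/a)K$; applied to every finite-dimensional marginal, this is what turns the usual GP limit into a TP.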
📝 Abstract
The asymptotic properties of Bayesian Neural Networks (BNNs) have been extensively studied, particularly regarding their approximations by Gaussian processes in the infinite-width limit. We extend these results by showing that posterior BNNs can be approximated by Student-t processes, which offer greater flexibility in modeling uncertainty. Specifically, we show that if the parameters of a BNN follow a Gaussian prior distribution, and the variance of both the last hidden layer and the Gaussian likelihood function follows an Inverse-Gamma prior distribution, then the resulting posterior BNN converges to a Student-t process in the infinite-width limit. Our proof leverages the Wasserstein metric to establish control over the convergence rate of the Student-t process approximation.
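As a quick illustration of the scale-mixture mechanism at the prior level, the sketch below draws prior samples from a wide BNN whose output-layer variance is Inverse-Gamma distributed and checks that the marginal output is approximately Student-t. This is a minimal sketch under assumed choices (a one-hidden-layer tanh network with $1/\sqrt{n}$ output scaling, hypothetical hyperparameters `a = b = 4`); it is not the paper's construction or its posterior analysis.

```python
import numpy as np
from scipy import stats

# Minimal sketch, assuming a toy one-hidden-layer BNN
#   f(x) = (1/sqrt(n)) * sum_j v_j * tanh(w_j . x + c_j)
# with N(0, 1) priors on w and c and, conditionally on sigma^2,
# N(0, sigma^2) priors on the output weights v, where
# sigma^2 ~ Inv-Gamma(a, b). Illustrative only, not the paper's setup.
rng = np.random.default_rng(0)

def sample_bnn_output(x, width, a, b, n_samples):
    """Draw prior samples of f(x), resampling the whole network each time."""
    d = x.shape[0]
    # One variance draw per network: sigma^2 ~ Inv-Gamma(a, b).
    sigma2 = stats.invgamma.rvs(a, scale=b, size=n_samples, random_state=rng)
    out = np.empty(n_samples)
    for i in range(n_samples):
        W = rng.normal(size=(width, d))             # input weights ~ N(0, 1)
        c = rng.normal(size=width)                  # hidden biases ~ N(0, 1)
        h = np.tanh(W @ x + c)                      # hidden activations
        v = rng.normal(scale=np.sqrt(sigma2[i]), size=width)  # output layer
        out[i] = v @ h / np.sqrt(width)             # 1/sqrt(n) scaling
    return out

a, b = 4.0, 4.0               # hypothetical Inv-Gamma hyperparameters
x0 = np.array([0.5, -1.0])    # arbitrary test input
f0 = sample_bnn_output(x0, width=1000, a=a, b=b, n_samples=5000)

# Conditional on sigma^2, the CLT makes f(x0) approximately Gaussian at
# large width; marginalizing over the Inv-Gamma variance then gives a
# Student-t with nu = 2a degrees of freedom, which a t-fit should recover.
df_hat, loc_hat, scale_hat = stats.t.fit(f0)
print(f"fitted degrees of freedom: {df_hat:.1f} (scale-mixture theory: {2*a:.0f})")
```

With only a few thousand prior draws the fitted degrees of freedom is noisy, but it should land near $2a = 8$ rather than drift toward the Gaussian (infinite-degrees-of-freedom) limit, in line with the scale-mixture identity above.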