AI Summary
This work addresses the under-coverage commonly observed in Gaussian variational inference, which arises when a Gaussian surrogate fails to capture the radial profile of the target posterior. To remedy this, the authors propose radVI, an algorithm that optimizes over the radial profile of the approximation by means of radial transport maps. Its convergence guarantees rest on recent developments in optimization over the Wasserstein space, combined with new Caffarelli-type regularity properties of radial transport maps. The approach is a cheap add-on to existing VI schemes such as Gaussian (mean-field) VI and the Laplace approximation. Empirical results are reported to show that radVI improves approximation quality for posteriors with non-Gaussian radial profiles while preserving computational efficiency.
Abstract
In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of $\pi$, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space (the space of probability distributions endowed with the Wasserstein distance) and new regularity properties of radial transport maps in the style of Caffarelli (2000).
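The coverage problem described above can be seen in a small numerical sketch. This is not the radVI algorithm, only an illustration of a radial-profile mismatch under assumptions of our own. The target is taken to be a multivariate Student-t distribution. The surrogate is a standard Gaussian with the same location and scale matrix, chosen by hand rather than obtained from any VI fit. The function name `radial_coverage` and all parameter values are hypothetical.

```python
import numpy as np


def radial_coverage(d=20, nu=5.0, n=100_000, level=0.95, seed=0):
    """Compare radial profiles of a Gaussian surrogate and a heavy-tailed target.

    Assumed setup (not from the paper): the target is a d-dimensional
    Student-t with nu degrees of freedom, and the surrogate is N(0, I_d).
    Returns the radius of the centered ball holding `level` of the surrogate's
    mass, and the fraction of target samples that fall inside that ball.
    """
    rng = np.random.default_rng(seed)

    # Radii of samples from the Gaussian surrogate N(0, I_d).
    r_gauss = np.linalg.norm(rng.standard_normal((n, d)), axis=1)

    # Radii of samples from the Student-t target, using the standard
    # representation x = z / sqrt(w / nu), z ~ N(0, I_d), w ~ chi^2(nu).
    z = rng.standard_normal((n, d))
    w = rng.chisquare(nu, size=n)
    r_target = np.linalg.norm(z, axis=1) / np.sqrt(w / nu)

    # Radius of the ball that contains `level` of the surrogate's mass.
    r_star = np.quantile(r_gauss, level)

    # Fraction of the target's mass inside the same ball.
    coverage = float(np.mean(r_target <= r_star))
    return float(r_star), coverage


if __name__ == "__main__":
    r_star, coverage = radial_coverage()
    print(f"surrogate 95% radius: {r_star:.3f}")
    print(f"target mass inside that ball: {coverage:.3f}")
```

In this toy setting the ball that holds 95% of the Gaussian's mass holds noticeably less of the target's mass, because the target's radii are far more spread out than the Gaussian's, which concentrate near $\sqrt{d}$. Adjusting the radial profile of the surrogate, rather than only its mean and covariance, is the kind of correction the abstract refers to.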