🤖 AI Summary
To address the instability and computational overhead caused by repeated gradient evaluations in exponential-family variational approximations, this paper proposes gradient-free Least-Squares Variational Inference (LSVI). LSVI reformulates the variational fixed-point equation as a least-squares regression problem, thereby decoupling variational inference from gradient computation. Theoretically, the paper proves that LSVI is equivalent to stochastic mirror descent and uses this equivalence to establish convergence guarantees. For Gaussian approximations, LSVI supports both full-covariance (O(d³)) and mean-field (O(d)) parameterizations, trading off accuracy against efficiency. Empirical results show that LSVI outperforms state-of-the-art methods across diverse high-dimensional tasks, improving stability and computational efficiency while remaining entirely gradient-free.
📝 Abstract
Variational inference consists in finding the best approximation of a target distribution within a certain family, where `best' means (typically) smallest Kullback-Leibler divergence. We show that, when the approximation family is exponential, the best approximation is the solution of a fixed-point equation. We introduce LSVI (Least-Squares Variational Inference), a Monte Carlo variant of the corresponding fixed-point recursion, where each iteration boils down to ordinary least-squares regression and does not require computing gradients. We show that LSVI is equivalent to stochastic mirror descent, and we use this insight to derive convergence guarantees. We introduce various ideas to improve LSVI further when the approximation family is Gaussian, leading to $O(d^3)$ complexity in the dimension $d$ of the target in the full-covariance case, and $O(d)$ complexity in the mean-field case. We show that LSVI outperforms state-of-the-art methods in a range of examples, while remaining gradient-free, that is, without ever computing gradients.
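The abstract describes each LSVI iteration as an ordinary least-squares regression of the (unnormalized) log target onto the sufficient statistics of the exponential family, with the fitted coefficients giving the new natural parameters. Below is a minimal, hedged sketch of that idea for a mean-field Gaussian family, using one joint OLS fit per iteration (the paper's actual $O(d)$ mean-field scheme, damping schedule, and variance safeguards may differ; all function names and defaults here are illustrative assumptions):

```python
import numpy as np

def lsvi_gaussian_meanfield(logp, d, n_samples=200, n_iters=40, damping=0.5, seed=0):
    """Sketch of an LSVI-style fixed-point iteration for a diagonal Gaussian q.

    logp: function returning the unnormalized log density of the target at a point.
    Each iteration: sample from the current q, regress logp(theta) on the
    sufficient statistics (1, theta, theta**2) by OLS, and read the new
    natural parameters off the regression coefficients.
    """
    rng = np.random.default_rng(seed)
    mu, sigma2 = np.zeros(d), np.ones(d)  # initial variational parameters
    for _ in range(n_iters):
        # Monte Carlo sample from the current approximation q = N(mu, diag(sigma2))
        theta = mu + np.sqrt(sigma2) * rng.standard_normal((n_samples, d))
        y = np.array([logp(t) for t in theta])
        # Design matrix: intercept + sufficient statistics of the diagonal Gaussian
        X = np.concatenate([np.ones((n_samples, 1)), theta, theta**2], axis=1)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        eta1, eta2 = beta[1:d + 1], beta[d + 1:]
        eta2 = np.minimum(eta2, -1e-6)  # heuristic guard: keep variances positive
        new_sigma2 = -0.5 / eta2        # map natural parameters back to (mu, sigma2)
        new_mu = eta1 * new_sigma2
        # Damped fixed-point update (damping factor is an assumption, not the paper's)
        mu = (1 - damping) * mu + damping * new_mu
        sigma2 = (1 - damping) * sigma2 + damping * new_sigma2
    return mu, sigma2
```

For a Gaussian target the log density is exactly quadratic, so the regression is exact up to the damped recursion and the iteration recovers the target's mean and variances; for a general target, the OLS fit projects the log target onto the family's sufficient statistics, which is the sense in which the fixed-point recursion is "least squares".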