🤖 AI Summary
To address the instability and computational overhead caused by repeated gradient evaluations in exponential-family variational approximations, this paper proposes gradient-free Least-Squares Variational Inference (LSVI). LSVI reformulates the variational fixed-point equation as a least-squares regression problem, thereby decoupling variational inference from gradient computation. Theoretically, the paper proves that LSVI is equivalent to stochastic mirror descent and uses this equivalence to establish convergence guarantees. For Gaussian approximations, LSVI supports both full-covariance (O(d³)) and mean-field (O(d)) parameterizations, trading off accuracy against efficiency. Empirical results show that LSVI outperforms state-of-the-art methods across diverse high-dimensional tasks, improving stability and computational efficiency while remaining entirely gradient-free.
📝 Abstract
Variational inference consists in finding the best approximation of a target distribution within a certain family, where `best' means (typically) smallest Kullback-Leibler divergence. We show that, when the approximation family is exponential, the best approximation is the solution of a fixed-point equation. We introduce LSVI (Least-Squares Variational Inference), a Monte Carlo variant of the corresponding fixed-point recursion, where each iteration boils down to ordinary least-squares regression and does not require computing gradients. We show that LSVI is equivalent to stochastic mirror descent, and we use this insight to derive convergence guarantees. We introduce various ideas to improve LSVI further when the approximation family is Gaussian, leading to $O(d^3)$ complexity in the dimension $d$ of the target in the full-covariance case, and $O(d)$ complexity in the mean-field case. We show that LSVI outperforms state-of-the-art methods in a range of examples, while remaining gradient-free, that is, without ever computing gradients.
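The abstract describes each LSVI iteration as an ordinary least-squares regression of the (unnormalized) log target onto the sufficient statistics of the exponential family, with the fitted coefficients giving the new natural parameters. Below is a minimal, hedged sketch of that idea for a mean-field Gaussian family, using one joint OLS fit per iteration (the paper's actual $O(d)$ mean-field scheme, damping schedule, and variance safeguards may differ; all function names and defaults here are illustrative assumptions):

```python
import numpy as np

def lsvi_gaussian_meanfield(logp, d, n_samples=200, n_iters=40, damping=0.5, seed=0):
    """Sketch of an LSVI-style fixed-point iteration for a diagonal Gaussian q.

    logp: function returning the unnormalized log density of the target at a point.
    Each iteration: sample from the current q, regress logp(theta) on the
    sufficient statistics (1, theta, theta**2) by OLS, and read the new
    natural parameters off the regression coefficients.
    """
    rng = np.random.default_rng(seed)
    mu, sigma2 = np.zeros(d), np.ones(d)  # initial variational parameters
    for _ in range(n_iters):
        # Monte Carlo sample from the current approximation q = N(mu, diag(sigma2))
        theta = mu + np.sqrt(sigma2) * rng.standard_normal((n_samples, d))
        y = np.array([logp(t) for t in theta])
        # Design matrix: intercept + sufficient statistics of the diagonal Gaussian
        X = np.concatenate([np.ones((n_samples, 1)), theta, theta**2], axis=1)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        eta1, eta2 = beta[1:d + 1], beta[d + 1:]
        eta2 = np.minimum(eta2, -1e-6)  # heuristic guard: keep variances positive
        new_sigma2 = -0.5 / eta2        # map natural parameters back to (mu, sigma2)
        new_mu = eta1 * new_sigma2
        # Damped fixed-point update (damping factor is an assumption, not the paper's)
        mu = (1 - damping) * mu + damping * new_mu
        sigma2 = (1 - damping) * sigma2 + damping * new_sigma2
    return mu, sigma2
```

For a Gaussian target the log density is exactly quadratic, so the regression is exact up to the damped recursion and the iteration recovers the target's mean and variances; for a general target, the OLS fit projects the log target onto the family's sufficient statistics, which is the sense in which the fixed-point recursion is "least squares".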