🤖 AI Summary
Full-covariance Gaussian approximations limit the scalability of black-box variational inference (BBVI) in high dimensions because of their prohibitive memory and optimization costs. To address this, the paper proposes BaM-LR, a three-stage framework integrating batching, score matching, and low-rank patching. The core innovation is the first integration of score matching with a diagonal-plus-low-rank (D+LR) covariance parameterization, coupled with a “patching” step in each iteration that projects the newly updated covariance matrix onto the family of D+LR matrices, thereby avoiding explicit construction and storage of large covariance matrices. Optimization proceeds via stochastic minibatch updates, and the patch borrows its low-rank decomposition from factor analysis. Experiments across multiple high-dimensional synthetic and real-world tasks demonstrate that BaM-LR reduces memory consumption by up to 90% while maintaining or improving approximation accuracy. This work establishes an efficient, robust paradigm for scalable Bayesian inference.
📝 Abstract
Black-box variational inference (BBVI) scales poorly to high-dimensional problems when it is used to estimate a multivariate Gaussian approximation with a full covariance matrix. In this paper, we extend the batch-and-match (BaM) framework for score-based BBVI to problems where it is prohibitively expensive to store such covariance matrices, let alone to estimate them. Unlike classical algorithms for BBVI, which use stochastic gradient descent to minimize the reverse Kullback-Leibler divergence, BaM uses more specialized updates to match the scores of the target density and its Gaussian approximation. We extend the updates for BaM by integrating them with a more compact parameterization of full covariance matrices. In particular, borrowing ideas from factor analysis, we add an extra step to each iteration of BaM (a patch) that projects each newly updated covariance matrix into a more efficiently parameterized family of diagonal-plus-low-rank matrices. We evaluate this approach on a variety of synthetic target distributions and real-world problems in high-dimensional inference.
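To make the diagonal-plus-low-rank idea concrete, the following is a minimal NumPy sketch of the parameterization Σ ≈ diag(ψ) + ΛΛᵀ. It is not the paper's patch step, which the abstract does not specify. As a stand-in it uses the classical iterated principal-factor heuristic from factor analysis, and all function names (`project_to_dlr`, `sample_dlr`, and so on) are hypothetical. The sketch also forms a dense covariance matrix for a small toy problem, which is exactly what a scalable method would avoid; it is here only to show what the projection targets and why the compact form saves memory.

```python
import numpy as np


def full_param_count(dim):
    """Free entries in a symmetric dim x dim covariance matrix."""
    return dim * (dim + 1) // 2


def dlr_param_count(dim, rank):
    """Entries stored by diag(psi) + lam @ lam.T: dim for psi, dim * rank for lam."""
    return dim * (rank + 1)


def project_to_dlr(cov, rank, n_iters=50, floor=1e-8):
    """Approximate cov by diag(psi) + lam @ lam.T.

    Illustrative stand-in (iterated principal-factor heuristic), not the
    paper's patch: alternate a rank-`rank` eigen-truncation of
    cov - diag(psi) with a refit of the diagonal.
    """
    psi = 0.5 * np.diag(cov)  # assumed initial guess for the diagonal part
    for _ in range(n_iters):
        evals, evecs = np.linalg.eigh(cov - np.diag(psi))  # ascending eigenvalues
        top_vals = np.clip(evals[-rank:], 0.0, None)       # keep the factor PSD
        lam = evecs[:, -rank:] * np.sqrt(top_vals)
        # Refit the diagonal so the approximation matches diag(cov).
        psi = np.clip(np.diag(cov) - np.sum(lam**2, axis=1), floor, None)
    return psi, lam


def sample_dlr(mean, psi, lam, n_samples, rng):
    """Draw from N(mean, diag(psi) + lam @ lam.T) without forming the covariance."""
    dim, rank = lam.shape
    eps_diag = rng.standard_normal((n_samples, dim))
    eps_low = rng.standard_normal((n_samples, rank))
    return mean + eps_diag * np.sqrt(psi) + eps_low @ lam.T


# Toy target with an exact rank-2 plus diagonal structure (assumed for illustration).
dim, rank = 20, 2
angles = 2.0 * np.pi * np.arange(dim) / dim
lam_true = np.column_stack([np.cos(angles), np.sin(angles)])
psi_true = np.linspace(0.5, 1.5, dim)
cov = np.diag(psi_true) + lam_true @ lam_true.T

psi_hat, lam_hat = project_to_dlr(cov, rank)
samples = sample_dlr(np.zeros(dim), psi_hat, lam_hat, 5, np.random.default_rng(0))

print("max abs error:", np.max(np.abs(np.diag(psi_hat) + lam_hat @ lam_hat.T - cov)))
print("stored numbers at dim=1000, rank=10:", dlr_param_count(1000, 10),
      "vs full:", full_param_count(1000))
```

The storage comparison is the point of the compact family: the full covariance grows quadratically in the dimension, while the diagonal-plus-low-rank form grows linearly for a fixed rank, and sampling needs only ψ and Λ.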