Provably Scalable Black-Box Variational Inference with Structured Variational Families

📅 2024-01-19
🏛️ International Conference on Machine Learning
📈 Citations: 5
Influential: 0
🤖 AI Summary
To address the $\mathcal{O}(N^2)$ per-iteration complexity of full-rank covariance approximations in black-box variational inference (BBVI) for hierarchical Bayesian models, which hinders scaling to large datasets, this paper introduces structured variational families, including low-rank-plus-diagonal and block-diagonal scale matrices. These structures preserve expressive modeling of local latent variables while reducing the per-iteration complexity to $\mathcal{O}(N)$. The paper gives the first rigorous proof that specific structured approximations achieve this linear complexity, bridging the gap between mean-field and full-rank variational families. Experiments on large-scale hierarchical models confirm the theory, showing speedups of multiple times over full-rank BBVI while maintaining accuracy comparable to or exceeding that of mean-field inference.
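The block-diagonal structure mentioned above can be illustrated with a minimal reparameterized-sampling sketch. This is our own NumPy illustration, not code from the paper; the names `sample_block_diag`, `d`, and `blocks` are hypothetical. The point is the cost accounting: with $N$ local blocks of fixed size $d$, drawing a sample costs $O(Nd^2) = O(N)$, whereas a dense full-rank scale matrix would cost $O((Nd)^2) = O(N^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_block_diag(mu, blocks, rng):
    """Draw z = mu + C @ eps where C = blockdiag(blocks).

    blocks has shape (N, d, d): one small scale block per local
    latent variable. Each block multiplies only its own slice of
    noise, so the total cost is N small (d x d) matvecs, i.e. O(N),
    instead of one dense (N*d x N*d) matvec, i.e. O(N^2).
    """
    N, d, _ = blocks.shape
    eps = rng.standard_normal(N * d)
    # einsum "nij,nj->ni": apply block n to noise slice n.
    local = np.einsum("nij,nj->ni", blocks, eps.reshape(N, d))
    return mu + local.ravel()

N, d = 1000, 3                           # N local variables of dimension d
mu = np.zeros(N * d)
blocks = np.tile(np.eye(d), (N, 1, 1))   # identity blocks -> z ~ N(mu, I)
z = sample_block_diag(mu, blocks, rng)
assert z.shape == (N * d,)
```

Full-rank BBVI would instead materialize a single $(Nd) \times (Nd)$ scale factor, which is exactly the quadratic dependence on the dataset size the paper avoids.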

📝 Abstract
Variational families with full-rank covariance approximations are known not to work well in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to e.g. mean-field families. This is particularly critical to hierarchical Bayesian models with local variables; their dimensionality increases with the size of the datasets. Consequently, one gets an iteration complexity with an explicit $\mathcal{O}(N^2)$ dependence on the dataset size $N$. In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of $\mathcal{O}\left(N\right)$, implying better scaling with respect to $N$. We empirically verify our theoretical results on large-scale hierarchical models.
Problem

Research questions and friction points this paper is trying to address.

Full-rank variational families scale poorly in high dimensions
Hierarchical models show O(N²) complexity with dataset size
Structured variational families achieve improved O(N) scaling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured variational families replace full-rank approximations
Scale matrix structures reduce iteration complexity to O(N)
Method improves scalability for hierarchical Bayesian models
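The other structure named in the summary, low-rank-plus-diagonal, admits an equally short sketch. Again this is our own hedged NumPy illustration (the helper `sample_lowrank` and the shapes are assumptions, not the paper's code): for a covariance $\Sigma = \mathrm{diag}(d) + U U^\top$ with $U \in \mathbb{R}^{D \times r}$, a reparameterized sample costs $O(Dr)$ instead of the $O(D^2)$ of a dense factor.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_lowrank(mu, diag, U, rng):
    """Sample from N(mu, diag(diag) + U @ U.T) without a dense factor.

    z = mu + sqrt(diag) * eps1 + U @ eps2, with eps1 in R^D and
    eps2 in R^r, has exactly the low-rank-plus-diagonal covariance.
    Cost per sample: O(D * r) rather than O(D^2).
    """
    D, r = U.shape
    eps1 = rng.standard_normal(D)
    eps2 = rng.standard_normal(r)
    return mu + np.sqrt(diag) * eps1 + U @ eps2

D, r = 10_000, 2              # high-dimensional latent, tiny rank
mu = np.zeros(D)
diag = np.ones(D)
U = np.zeros((D, r))          # zero low-rank part -> z ~ N(mu, I)
z = sample_lowrank(mu, diag, U, rng)
assert z.shape == (D,)
```

When $D$ grows linearly with the dataset size $N$ (one local variable per observation) and $r$ stays fixed, this is the claimed $\mathcal{O}(N)$ per-iteration scaling.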
🔎 Similar Papers
2023-07-27 · International Conference on Artificial Intelligence and Statistics · Citations: 8
Joohwan Ko
KAIST, Daejeon, Republic of Korea
Kyurae Kim
PhD Student, University of Pennsylvania
Bayesian inference · stochastic optimization · machine learning · signal processing
W. Kim
KAIST, Daejeon, Republic of Korea
Jacob R. Gardner
Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA, U.S.A.