🤖 AI Summary
To address the $\mathcal{O}(N^2)$ per-iteration complexity of full-rank covariance approximations in black-box variational inference (BBVI) for hierarchical Bayesian models—hindering scalability to large-scale data—this paper introduces structured variational families, including low-rank-plus-diagonal and block-diagonal scale matrices. These structures preserve expressive modeling of local latent variables while reducing computational complexity to $\mathcal{O}(N)$. The paper provides the first rigorous theoretical proof that specific structured approximations achieve this linear complexity, bridging the gap between mean-field and full-rank variational families. Leveraging stochastic gradient optimization and convergence analysis, the authors empirically validate the approach on large-scale hierarchical models, demonstrating significant speedups—multiple times faster than full-rank BBVI—while maintaining accuracy comparable to or exceeding that of mean-field inference.
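To make the complexity difference concrete, here is a minimal sketch (not the paper's code, and with hypothetical names) of why a block-diagonal scale matrix gives linear-in-$N$ cost: with $N$ local variables of dimension $d$, a full-rank scale matrix requires an $O((Nd)^2)$ matrix-vector product per reparameterized sample, whereas storing one small $d \times d$ lower-triangular block per local variable reduces this to $O(N d^2)$.

```python
import numpy as np

# Hedged sketch: reparameterized sampling from a Gaussian variational
# family whose scale matrix is block-diagonal, one small block per
# local latent variable. All names here are illustrative assumptions,
# not the paper's implementation.

rng = np.random.default_rng(0)
N, d = 1000, 3                      # number of local variables, block size

mu = np.zeros((N, d))               # variational means, one row per block
L = np.tile(np.eye(d), (N, 1, 1))   # N lower-triangular d x d scale blocks

def sample_block_diagonal(mu, L, rng):
    """Draw z = mu + L @ eps blockwise: cost O(N * d^2), linear in N."""
    eps = rng.standard_normal(mu.shape)          # (N, d) standard normals
    return mu + np.einsum("nij,nj->ni", L, eps)  # per-block matvec

z = sample_block_diagonal(mu, L, rng)
print(z.shape)  # (1000, 3)
```

A full-rank family would instead hold a single $(Nd) \times (Nd)$ scale matrix, whose matrix-vector product alone scales quadratically with the dataset size, which is the bottleneck the structured families avoid.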
📝 Abstract
Variational families with full-rank covariance approximations are known not to work well in black-box variational inference (BBVI), both empirically and theoretically. In fact, recent computational complexity results for BBVI have established that full-rank variational families scale poorly with the dimensionality of the problem compared to, e.g., mean-field families. This is particularly critical for hierarchical Bayesian models with local variables, whose dimensionality increases with the size of the dataset. Consequently, one gets an iteration complexity with an explicit $\mathcal{O}(N^2)$ dependence on the dataset size $N$. In this paper, we explore a theoretical middle ground between mean-field variational families and full-rank families: structured variational families. We rigorously prove that certain scale matrix structures can achieve a better iteration complexity of $\mathcal{O}\left(N\right)$, implying better scaling with respect to $N$. We empirically verify our theoretical results on large-scale hierarchical models.