SGD for Variational Inference: Tackling Unbounded Variance via Preconditioning and Dynamic Batching

📅 2026-05-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a challenge in black-box variational inference (BBVI): the stochastic gradients have unbounded variance and satisfy only the weaker Blum–Gladyshev (BG) condition, so optimization theory that assumes bounded variance does not apply. Focusing on the elliptic location-scale family of distributions, the authors study minibatch projected stochastic gradient descent (PSGD) combined with preconditioning and dynamic batching. They prove that an evidence lower bound (ELBO) maximizer exists, a property usually assumed a priori in the literature, and establish both finite-time and asymptotic convergence guarantees for the algorithm under the BG condition. Numerical results illustrate the theory and the efficacy of the approach in settings with unbounded gradient variance.
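For orientation, the contrast between the usual bounded-variance assumption and the BG condition can be written roughly as follows. This is a sketch in assumed notation: the symbols \lambda (variational parameters), \lambda^\star (an optimum), \hat g (the stochastic gradient estimator), F (the objective), and the constants \sigma, A, B are not taken from the paper, whose exact statement may differ.

```latex
% Standard assumption (fails for BBVI): variance bounded uniformly in \lambda
\mathbb{E}\,\bigl\|\hat g(\lambda) - \nabla F(\lambda)\bigr\|^2 \le \sigma^2
\quad \text{for all } \lambda .

% BG-type condition: second moment may grow quadratically with distance to an optimum
\mathbb{E}\,\bigl\|\hat g(\lambda)\bigr\|^2 \le A + B\,\bigl\|\lambda - \lambda^\star\bigr\|^2 ,
\qquad A, B \ge 0 .
```

Under the second inequality the gradient noise is controlled only relative to how far the iterate is from the optimum, which is why tools such as projection and growing batch sizes become relevant.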
📝 Abstract
Black-Box Variational Inference (BBVI) typically relies on Stochastic Gradient Descent (SGD) to optimize the Evidence Lower Bound (ELBO). However, the stochastic gradients in BBVI inherently exhibit unbounded variance, violating standard assumptions and instead satisfying the weaker Blum-Gladyshev (BG) condition, where variance grows quadratically with distance from the optimum. In this paper, we bridge the gap between stochastic optimization theory and the practical instances of BBVI. Focusing on the broad elliptic location-scale family of parameterized distributions, we offer two main contributions. First, we prove the existence of an ELBO solution, a foundational property usually assumed a priori in the literature. Second, we establish comprehensive convergence guarantees spanning finite-time and asymptotic regimes for Minibatch Projected SGD (PSGD) equipped with dynamic batching and preconditioning under the BG condition. Our theoretical framework demonstrates that dynamic batching combined with preconditioning systematically enables rigorous guarantees even in complex settings. We illustrate our theoretical findings with numerical results, highlighting the efficacy of our approach for modern inference tasks.
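To make the ingredients named in the abstract concrete, here is a minimal sketch of minibatch projected SGD on a toy problem. It is not the authors' algorithm or their schedules. It assumes a mean-field Gaussian variational family (one member of the location-scale family), a Gaussian target with precision matrix `A`, a fixed diagonal preconditioner `1 / A[i][i]`, a linearly growing batch size, a `step0 / sqrt(1 + t)` step size, and a projection that keeps each scale above `s_min`. All function names, parameters, and defaults are hypothetical.

```python
import math
import random


def grad_log_p(z, mu, A):
    """Gradient of log N(z; mu, A^{-1}) with respect to z, i.e. -A (z - mu)."""
    d = len(z)
    return [-sum(A[i][j] * (z[j] - mu[j]) for j in range(d)) for i in range(d)]


def psgd_bbvi(mu, A, steps=3000, step0=0.1, batch0=4, s_min=1e-2, seed=0):
    """Toy projected SGD on the negative ELBO of q = N(m, diag(s^2)).

    Assumptions (illustrative only): diagonal preconditioner, batch size
    growing by one every 100 steps, step size step0 / sqrt(1 + t).
    """
    rng = random.Random(seed)
    d = len(mu)
    m = [0.0] * d          # location parameters
    s = [1.0] * d          # scale parameters, constrained to s >= s_min
    precond = [1.0 / A[i][i] for i in range(d)]  # hypothetical preconditioner

    for t in range(steps):
        batch = batch0 + t // 100           # dynamic batching (assumed schedule)
        gamma = step0 / math.sqrt(1 + t)    # decaying step size (assumed)
        grad_m = [0.0] * d
        grad_s = [0.0] * d

        # Reparameterization gradient: z = m + s * eps with eps ~ N(0, I).
        for _ in range(batch):
            eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
            z = [m[i] + s[i] * eps[i] for i in range(d)]
            g = grad_log_p(z, mu, A)
            for i in range(d):
                grad_m[i] -= g[i] / batch
                grad_s[i] -= g[i] * eps[i] / batch

        for i in range(d):
            grad_s[i] -= 1.0 / s[i]  # gradient of the negative entropy term
            m[i] -= gamma * precond[i] * grad_m[i]
            # Projection step: clip the scale back onto {s >= s_min}.
            s[i] = max(s[i] - gamma * precond[i] * grad_s[i], s_min)
    return m, s


if __name__ == "__main__":
    mu = [1.0, -2.0]
    A = [[2.0, 0.5], [0.5, 1.0]]
    m, s = psgd_bbvi(mu, A)
    print("location:", m)
    print("scale:   ", s)
```

For this toy target the mean-field optimum is known in closed form (location `mu`, scales `1 / sqrt(A[i][i])`), so the output can be checked by eye. The per-sample gradient for the scale multiplies the target gradient by `eps`, so its variance grows with the distance of the iterate from the optimum, which is the kind of behavior the BG condition describes.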
Problem

Research questions and friction points this paper is trying to address.

Black-Box Variational Inference
Unbounded Variance
Stochastic Gradient Descent
Blum-Gladyshev Condition
Convergence Guarantees
Innovation

Methods, ideas, or system contributions that make the work stand out.

Black-Box Variational Inference
Unbounded Variance
Preconditioning
Dynamic Batching
Stochastic Gradient Descent