Scalable Mean-Field Variational Inference via Preconditioned Primal-Dual Optimization

šŸ“… 2026-02-07
šŸ“ˆ Citations: 0
✨ Influential: 0
šŸ“„ PDF
šŸ¤– AI Summary
This work addresses the slow convergence and numerical instability commonly encountered in large-scale mean-field variational inference under non-conjugate, high-dimensional settings. The authors reformulate the problem as a constrained finite-sum optimization task and propose PD-VI, a primal-dual algorithm built on an augmented Lagrangian formulation. To further improve efficiency, they introduce P²D-VI, which adds block preconditioning to account for the distinct geometric structure of each parameter block, enabling effective joint updates of global and local variational parameters. The study is the first to integrate primal-dual optimization with block preconditioning for variational inference, removing the need for conjugacy assumptions or explicit bounded-variance conditions. The methods attain an O(1/T) convergence rate to a stationary point in general settings and linear convergence under strong convexity. Experiments show consistent improvements over existing stochastic variational inference algorithms on both synthetic data and a large-scale spatial transcriptomics dataset.
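As a rough guide to the formulation the summary describes, a generic augmented Lagrangian for a constrained finite-sum problem reads as follows; the notation here is illustrative, not necessarily the paper's:

$$\min_{x \in \mathbb{R}^d} \ \frac{1}{n}\sum_{i=1}^{n} f_i(x) \quad \text{subject to} \quad Ax = b,$$

$$\mathcal{L}_\rho(x, \lambda) \;=\; \frac{1}{n}\sum_{i=1}^{n} f_i(x) \;+\; \langle \lambda,\, Ax - b \rangle \;+\; \frac{\rho}{2}\,\|Ax - b\|_2^2.$$

A primal-dual method alternates (mini-batch) descent on $\mathcal{L}_\rho$ in the primal variables $x$ with ascent on the constraint residual in the dual variables $\lambda$; block preconditioning replaces the plain primal step with a per-block scaled step.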

šŸ“ Abstract
In this work, we investigate the large-scale mean-field variational inference (MFVI) problem from a mini-batch primal-dual perspective. By reformulating MFVI as a constrained finite-sum problem, we develop a novel primal-dual algorithm based on an augmented Lagrangian formulation, termed primal-dual variational inference (PD-VI). PD-VI jointly updates global and local variational parameters in the evidence lower bound in a scalable manner. To further account for heterogeneous loss geometry across different variational parameter blocks, we introduce a block-preconditioned extension, P$^2$D-VI, which adapts the primal-dual updates to the geometry of each parameter block and improves both numerical robustness and practical efficiency. We establish convergence guarantees for both PD-VI and P$^2$D-VI under properly chosen constant step size, without relying on conjugacy assumptions or explicit bounded-variance conditions. In particular, we prove $O(1/T)$ convergence to a stationary point in general settings and linear convergence under strong convexity. Numerical experiments on synthetic data and a real large-scale spatial transcriptomics dataset demonstrate that our methods consistently outperform existing stochastic variational inference approaches in terms of convergence speed and solution quality.
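To make the update structure concrete, below is a minimal sketch of a preconditioned augmented-Lagrangian primal-dual loop on a toy quadratic finite-sum problem. All names, step sizes, the toy objective, and the two-block split are illustrative assumptions, not the authors' PD-VI/P$^2$D-VI implementation.

import numpy as np

# Toy problem: minimize (1/n) * sum_i 0.5 * (q_i @ x)^2 subject to A @ x = b,
# with a separate diagonal preconditioner per parameter block. This is a generic
# preconditioned augmented-Lagrangian primal-dual loop, not the paper's code.
rng = np.random.default_rng(0)
n, d, m = 200, 10, 3
Q = rng.standard_normal((n, d))          # data defining the smooth losses f_i
A = rng.standard_normal((m, d))          # linear constraint A @ x = b
b = rng.standard_normal(m)
rho, eta_x, eta_lam = 1.0, 0.02, 0.1     # penalty and constant step sizes (assumed)
blocks = [slice(0, 5), slice(5, 10)]     # two parameter blocks (assumed split)
P = [np.full(5, 1.0), np.full(5, 4.0)]   # per-block diagonal preconditioners

x = np.zeros(d)
lam = np.zeros(m)
for t in range(2000):
    # Mini-batch gradient of f(x) = (1/n) * sum_i 0.5 * (q_i @ x)^2.
    idx = rng.choice(n, size=32, replace=False)
    g = Q[idx].T @ (Q[idx] @ x) / len(idx)
    # Gradient of the augmented Lagrangian with respect to x.
    g_aug = g + A.T @ (lam + rho * (A @ x - b))
    # Preconditioned primal descent, block by block.
    for blk, p in zip(blocks, P):
        x[blk] -= eta_x * g_aug[blk] / p
    # Dual ascent on the constraint residual.
    lam += eta_lam * (A @ x - b)

print("constraint residual:", np.linalg.norm(A @ x - b))

With constant step sizes and stochastic gradients, the iterates settle near a stationary point of the augmented Lagrangian rather than converging exactly, which mirrors the constant-step-size regime analyzed in the abstract.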
Problem

Research questions and friction points this paper is trying to address.

mean-field variational inference
scalable inference
primal-dual optimization
large-scale Bayesian inference
heterogeneous parameter geometry
Innovation

Methods, ideas, or system contributions that make the work stand out.

primal-dual optimization
mean-field variational inference
preconditioning
scalable inference
augmented Lagrangian
Authors

Jinhua Lyv
Department of Industrial Engineering and Management Science, Northwestern University

Tianmin Yu
Department of Mathematics, Northwestern University

Ying Ma
Assistant Professor of Biostatistics, Brown University (statistical genetics and genomics)

Naichen Shi
University of Michigan