Scalable Mean-Field Variational Inference via Preconditioned Primal-Dual Optimization

📅 2026-02-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the slow convergence and numerical instability that large-scale mean-field variational inference commonly encounters in non-conjugate, high-dimensional settings. The authors reformulate the problem as a constrained finite-sum optimization task and propose PD-VI, a primal-dual algorithm built on an augmented Lagrangian formulation that jointly updates global and local variational parameters. To account for the different loss geometry of each variational parameter block, they introduce P²D-VI, a block-preconditioned extension that improves numerical robustness and practical efficiency. The convergence analysis covers both algorithms under a properly chosen constant step size and does not rely on conjugacy assumptions or explicit bounded-variance conditions: it gives O(1/T) convergence to a stationary point in general settings and linear convergence under strong convexity. Experiments on synthetic data and a large-scale spatial transcriptomics dataset show faster convergence and better solution quality than existing stochastic variational inference approaches.
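
As a rough illustration of what a constrained finite-sum reformulation with an augmented Lagrangian can look like, here is a generic consensus-style sketch. The notation is an assumption made for this example and is not taken from the paper: $\theta$ stands for the global variational parameter, $\phi_i$ for the local parameter of data point $i$, $f_i$ for the $i$-th negative ELBO term, $\theta_i$ for a local copy of $\theta$, $\lambda_i$ for a multiplier, and $\rho > 0$ for a penalty weight. The paper's actual formulation may differ.

```latex
% Assumed notation, for illustration only (not the paper's exact formulation).
% Constrained finite-sum problem: each term gets its own copy of the global
% parameter, tied together by equality constraints.
\min_{\theta,\,\{\theta_i,\phi_i\}} \ \frac{1}{n}\sum_{i=1}^{n} f_i(\theta_i,\phi_i)
\quad \text{s.t.} \quad \theta_i = \theta, \quad i = 1,\dots,n

% Augmented Lagrangian: multiplier term plus quadratic penalty per constraint.
\mathcal{L}_\rho = \frac{1}{n}\sum_{i=1}^{n}\Big[ f_i(\theta_i,\phi_i)
  + \langle \lambda_i,\ \theta_i - \theta \rangle
  + \frac{\rho}{2}\,\|\theta_i - \theta\|^2 \Big]
```

In a splitting of this kind, each summand touches only its own $(\theta_i, \phi_i, \lambda_i)$, which is what makes mini-batch updates over a subset of indices natural.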

📝 Abstract

In this work, we investigate the large-scale mean-field variational inference (MFVI) problem from a mini-batch primal-dual perspective. By reformulating MFVI as a constrained finite-sum problem, we develop a novel primal-dual algorithm based on an augmented Lagrangian formulation, termed primal-dual variational inference (PD-VI). PD-VI jointly updates the global and local variational parameters of the evidence lower bound in a scalable manner. To further account for heterogeneous loss geometry across different variational parameter blocks, we introduce a block-preconditioned extension, P$^2$D-VI, which adapts the primal-dual updates to the geometry of each parameter block and improves both numerical robustness and practical efficiency. We establish convergence guarantees for both PD-VI and P$^2$D-VI under a properly chosen constant step size, without relying on conjugacy assumptions or explicit bounded-variance conditions. In particular, we prove $O(1/T)$ convergence to a stationary point in general settings and linear convergence under strong convexity. Numerical experiments on synthetic data and a real large-scale spatial transcriptomics dataset demonstrate that our methods consistently outperform existing stochastic variational inference approaches in terms of convergence speed and solution quality.
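
To make the idea of mini-batch primal-dual updates with block-wise scaling more concrete, the following toy sketch may help. It is not PD-VI or P$^2$D-VI: it solves a small quadratic consensus problem, $\min_z \sum_i \tfrac{1}{2} d_i (z - a_i)^2$ applied coordinate-wise, by giving each term a local copy $x_i$ constrained to equal $z$ and running an augmented-Lagrangian loop over random mini-batches. The function name `toy_primal_dual`, the data `a` and `d`, and the use of a per-coordinate penalty `rho` as a stand-in for block preconditioning are all assumptions introduced for this example.

```python
import numpy as np

def toy_primal_dual(a, d, rho, batch_size=2, iters=2000, seed=0):
    """Toy mini-batch augmented-Lagrangian loop (illustrative, not the paper's method).

    a:   (n, p) per-term targets (hypothetical data)
    d:   (n, p) positive per-term curvatures; columns may differ widely in scale
    rho: scalar or (p,) penalty; a vector rescales each coordinate block separately
    """
    rng = np.random.default_rng(seed)
    n, p = a.shape
    rho = np.broadcast_to(np.asarray(rho, dtype=float), (p,))
    x = a.astype(float).copy()   # local copies x_i, started at each local minimizer
    lam = np.zeros((n, p))       # multipliers for the constraints x_i = z
    z = x.mean(axis=0)           # shared (global) variable
    for _ in range(iters):
        batch = rng.choice(n, size=batch_size, replace=False)
        # Primal step: exact minimizer of the augmented Lagrangian in x_i
        x[batch] = (d[batch] * a[batch] - lam[batch] + rho * z) / (d[batch] + rho)
        # Dual step: ascent on the multipliers of the sampled terms only
        lam[batch] += rho * (x[batch] - z)
        # Global step: minimizer of the augmented Lagrangian in z
        z = (x + lam / rho).mean(axis=0)
    return z

# Two coordinate blocks whose curvature differs by a factor of about 100.
rng = np.random.default_rng(1)
a = rng.normal(size=(8, 2))
d = rng.uniform(0.5, 1.5, size=(8, 2)) * np.array([1.0, 100.0])
z_star = (d * a).sum(axis=0) / d.sum(axis=0)   # closed-form minimizer

# Penalty matched to each block's curvature scale.
z_block = toy_primal_dual(a, d, rho=np.array([1.0, 100.0]))
print(np.abs(z_block - z_star).max())
```

In this toy setting, a single scalar penalty has to be large enough for the stiff coordinate, which tends to slow progress on the flat one; matching the penalty to each block's scale avoids that trade-off. The paper's preconditioner and update rules may be quite different.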

Problem

Research questions and friction points this paper is trying to address.

mean-field variational inference

scalable inference

primal-dual optimization

large-scale Bayesian inference

heterogeneous parameter geometry

Innovation

Methods, ideas, or system contributions that make the work stand out.

primal-dual optimization

mean-field variational inference

preconditioning