AI Summary
Stochastic natural-gradient variational inference (NGVI) lacks theoretical convergence guarantees for non-conjugate models. Method: The paper analyzes NGVI through relative smoothness and the mirror descent framework, establishing the first verifiable condition under which the non-conjugate variational objective satisfies relative smoothness and exposing its hidden convex structure. Building on this, it proposes a modified NGVI algorithm that performs stochastic natural-gradient updates under a mean-field parameterization, equipped with a non-Euclidean projection and analyzed via relative smoothness and relative strong convexity. Contribution/Results: The algorithm provably converges to a stationary point in general non-conjugate settings, and attains an $O(1/t)$ convergence rate to the global optimum under relative strong convexity. This yields the first NGVI framework for non-conjugate posterior approximation with rigorous, non-asymptotic convergence guarantees.
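A minimal sketch of the kind of update this describes, assuming a toy non-conjugate model (Bayesian logistic regression with a Gaussian prior), a mean-field Gaussian variational family, and illustrative values for the step size, projection threshold, and iteration count; the paper's exact algorithm, mirror map, and step-size conditions are not specified in this summary.

```python
# Sketch: NGVI as stochastic mirror descent on a mean-field Gaussian,
# with a non-Euclidean projection keeping the precision positive.
# Model, step size eta, and threshold eps are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy non-conjugate model: Bayesian logistic regression, prior N(0, s0^2 I).
n, d, s0 = 200, 3, 5.0
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ theta_true))).astype(float)

def grad_neg_log_joint(theta):
    """Gradient of -log p(y, theta) for logistic regression."""
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ theta, -30, 30)))
    return X.T @ (p - y) + theta / s0**2

# Natural parameters of q = N(m, diag(v)):
#   lam1 = m / v,  lam2 = -1 / (2 v)   (lam2 must stay negative).
lam1, lam2 = np.zeros(d), -0.5 * np.ones(d)

eta, eps = 0.01, 1e-3  # step size and projection threshold (assumed)
for t in range(5000):
    v, m = -0.5 / lam2, -0.5 * lam1 / lam2   # back to mean/variance
    noise = rng.normal(size=d)
    theta = m + np.sqrt(v) * noise           # reparameterized sample
    g = grad_neg_log_joint(theta)
    # One-sample gradients of the negative ELBO w.r.t. (m, v);
    # the -1/(2v) term is the exact entropy gradient.
    g_m = g
    g_v = g * noise / (2.0 * np.sqrt(v)) - 0.5 / v
    # Chain rule to expectation parameters (mu1, mu2) = (m, v + m^2),
    # then the natural-gradient (mirror descent) step on (lam1, lam2).
    lam1 -= eta * (g_m - 2.0 * m * g_v)
    lam2 -= eta * g_v
    # Non-Euclidean projection: keep q inside the Gaussian family.
    lam2 = np.minimum(lam2, -eps)

v, m = -0.5 / lam2, -0.5 * lam1 / lam2
print("posterior mean estimate:", m)
```

The step on $(\lambda_1, \lambda_2)$ is a Euclidean step in natural parameters using gradients taken in expectation parameters, which is exactly the natural-gradient/mirror-descent correspondence; the final clamp plays the role of the non-Euclidean projection that keeps the iterate inside the variational family.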
Abstract
Stochastic Natural Gradient Variational Inference (NGVI) is a widely used method for approximating posterior distributions in probabilistic models. Despite its empirical success and foundational role in variational inference, its theoretical underpinnings remain limited, particularly in the case of non-conjugate likelihoods. While NGVI has been shown to be a special instance of Stochastic Mirror Descent, and recent work has provided convergence guarantees using relative smoothness and strong convexity for conjugate models, these results do not extend to the non-conjugate setting, where the variational loss becomes non-convex and harder to analyze. In this work, we focus on the mean-field parameterization and advance the theoretical understanding of NGVI in three key directions. First, we derive sufficient conditions under which the variational loss satisfies relative smoothness with respect to a suitable mirror map. Second, leveraging this structure, we propose a modified NGVI algorithm incorporating non-Euclidean projections and prove its global non-asymptotic convergence to a stationary point. Finally, under additional structural assumptions on the likelihood, we uncover hidden convexity properties of the variational loss and establish fast global convergence of NGVI to a global optimum. These results provide new insights into the geometry and convergence behavior of NGVI in challenging inference settings.
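For reference, the two conditions the abstract invokes are the standard ones from the relative-smoothness literature, stated with respect to a mirror map $h$ and its Bregman divergence; the paper's particular choice of mirror map for the variational loss is not given in this abstract.

```latex
% Standard definitions of relative smoothness / relative strong convexity
% with respect to a mirror map h (Bregman divergence D_h).
\[
  D_h(x, y) \;=\; h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle ,
\]
\[
  \mathcal{L}(x) \;\le\; \mathcal{L}(y) + \langle \nabla \mathcal{L}(y),\, x - y \rangle + L\, D_h(x, y)
  \quad \text{($L$-smoothness relative to $h$)},
\]
\[
  \mathcal{L}(x) \;\ge\; \mathcal{L}(y) + \langle \nabla \mathcal{L}(y),\, x - y \rangle + \mu\, D_h(x, y)
  \quad \text{($\mu$-strong convexity relative to $h$)}.
\]
```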