Inversion-Free Natural Gradient Descent on Riemannian Manifolds

📅 2026-04-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Natural gradient methods are computationally expensive because each step requires explicitly inverting the Fisher information matrix, and their standard formulation assumes a Euclidean parameter space. This work proposes an inversion-free stochastic natural gradient framework for distributions whose parameters lie on a Riemannian manifold: it maintains an online approximation of the inverse Fisher matrix, combining score vectors from different tangent spaces via parallel transport. A limited-memory variant reduces storage cost, and the method shows significant empirical advantages over Euclidean baselines on variational Bayesian Gaussian approximation and normalizing flow tasks. Theoretically, the authors prove almost-sure convergence at a rate of $O(\log s / s^\alpha)$ for step size exponents $\alpha > 2/3$.
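For intuition, the update described above can be written schematically as follows; this is a hedged reconstruction from the abstract, where $H_s$ (the online inverse-FIM estimate), $u_s$ (a sampled score vector), $\gamma_s$ (the step size), and $\mathcal{T}$ (parallel transport) are our notation rather than the paper's:

$$\theta_{s+1} = \operatorname{Exp}_{\theta_s}\!\bigl(-\gamma_s\, H_s\, \widehat{\operatorname{grad}} f(\theta_s)\bigr), \qquad H_{s+1} = \mathcal{T}_{\theta_s \to \theta_{s+1}}\bigl[\operatorname{update}(H_s,\, u_s)\bigr],$$

where $\gamma_s \propto s^{-\alpha}$ with $\alpha > 2/3$, and the rank-one update keeps $H_s \approx F(\theta_s)^{-1}$ without ever forming or inverting the FIM explicitly.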
📝 Abstract
The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions whose parameters lie on a Riemannian manifold. The manifold setting offers several advantages: one can implicitly enforce parameter constraints such as positive definiteness and orthogonality, ensure parameters are identifiable, or guarantee regularity properties of the objective like geodesic convexity. Building on an intrinsic formulation of the Fisher information matrix (FIM) on a manifold, our method maintains an online approximation of the inverse FIM, which is efficiently updated at quadratic cost using score vectors sampled at successive iterates. In the Riemannian setting, these score vectors belong to different tangent spaces and must be combined using transport operations. We prove almost-sure convergence rates of $O(\log s / s^\alpha)$ for the squared distance to the minimizer when the step size exponent $\alpha > 2/3$. We also establish almost-sure rates for the approximate FIM, which now accumulates transport-based errors. A limited-memory variant of the algorithm with sub-quadratic storage complexity is proposed. Finally, we demonstrate the effectiveness of our method relative to its Euclidean counterparts on variational Bayes with Gaussian approximations and normalizing flows.
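The abstract does not spell out the quadratic-cost update of the inverse FIM, but a rank-one Sherman–Morrison recursion is a standard way to achieve exactly that complexity. Below is a minimal NumPy sketch under this assumption; `sherman_morrison_update`, `retract`, and `transport` are hypothetical names used for illustration, not the paper's API:

```python
import numpy as np

def sherman_morrison_update(H, u, beta):
    """Rank-one update of an inverse-FIM estimate at quadratic cost.

    Assumes the FIM estimate evolves as F_new = (1 - beta) * F_old + beta * u u^T
    (an assumption, not the paper's stated recursion). Sherman-Morrison then
    gives H_new = F_new^{-1} from H_old = F_old^{-1} in O(d^2) work,
    with no explicit matrix inversion.
    """
    H = H / (1.0 - beta)  # inverse of the decayed term (1 - beta) * F_old
    Hu = H @ u
    return H - beta * np.outer(Hu, Hu) / (1.0 + beta * (u @ Hu))

def ngd_step(theta, H, grad, score, beta, step, retract, transport):
    """One inversion-free Riemannian natural gradient step (sketch).

    `retract(theta, v)` maps a tangent vector to a point on the manifold,
    and `transport(theta, theta_new, H)` conjugates H by parallel transport
    into the new tangent space -- both are hypothetical interfaces here.
    """
    H = sherman_morrison_update(H, score, beta)       # refresh inverse-FIM estimate
    theta_new = retract(theta, -step * (H @ grad))    # natural gradient move
    return theta_new, transport(theta, theta_new, H)  # carry H to the new point
```

In the flat Euclidean case, `retract` reduces to vector addition and `transport` to the identity, recovering an ordinary online natural gradient method; the manifold version must additionally conjugate $H$ by the transport, which is the source of the transport-based errors the abstract mentions.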
Problem

Research questions and friction points this paper is trying to address.

natural gradient
Riemannian manifold
Fisher information matrix
stochastic optimization
parameter constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Riemannian manifold
natural gradient descent
inversion-free
Fisher information matrix
parallel transport