Dual Riemannian Newton Method on Statistical Manifolds

📅 2025-11-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing manifold Newton methods for parameter estimation in probabilistic models neglect the dual affine connection structure of information geometry, leading to slow convergence. To address this, we propose the first second-order optimization method grounded in dual Riemannian geometry. Our core innovation integrates dual affine connections—central to information geometry—into the Newton framework, combining the Fisher–Rao metric with retraction mappings to derive a geometrically aware Riemannian Newton update rule. We theoretically establish local quadratic convergence on statistical manifolds. Empirical evaluation demonstrates substantial acceleration over first-order methods across canonical probabilistic models. This work is the first to systematically reveal the fundamental role of dual connections in optimization dynamics, establishing a new paradigm for efficient, information-geometrically principled parameter learning.

Technology Category

Application Category

📝 Abstract
In probabilistic modeling, parameter estimation is commonly formulated as a minimization problem on a parameter manifold. Optimization in such spaces requires geometry-aware methods that respect the underlying information structure. While the natural gradient leverages the Fisher information metric as a form of Riemannian gradient descent, it remains a first-order method and often exhibits slow convergence near optimal solutions. Existing second-order manifold algorithms typically rely on the Levi-Civita connection, thus overlooking the dual-connection structure that is central to information geometry. We propose the dual Riemannian Newton method, a Newton-type optimization algorithm on manifolds endowed with a metric and a pair of dual affine connections. The dual Riemannian Newton method explicates how duality shapes second-order updates: when the retraction (a local surrogate of the exponential map) is defined by one connection, the associated Newton equation is posed with its dual. We establish local quadratic convergence and validate the theory with experiments on representative statistical models. Thus, the dual Riemannian Newton method thus delivers second-order efficiency while remaining compatible with the dual structures that underlie modern information-geometric learning and inference.
Problem

Research questions and friction points this paper is trying to address.

Develops Newton method using dual connections on statistical manifolds
Addresses slow convergence of first-order natural gradient optimization
Incorporates dual geometry structure overlooked by existing manifold algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual Riemannian Newton method uses metric and dual connections
Newton equation uses dual connection to retraction's connection
Achieves quadratic convergence on statistical manifolds
🔎 Similar Papers
No similar papers found.
D
Derun Zhou
National Institute of Informatics, Tokyo, 101-8430, Japan
K
Keisuke Yano
The Institute of Statistical Mathematics, Tokyo, 190-8562, Japan
Mahito Sugiyama
Mahito Sugiyama
Associate Professor, National Institute of Informatics
Artificial IntelligenceMachine LearningKnowledge DiscoveryData Mining