From Saddle Points Toward Global Minima: A Newton-Type Method on Wasserstein Space

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the minimization of non-convex functionals over Wasserstein space by proposing the Wasserstein Saddle-Free Newton (WSFN) method. WSFN is a second-order method that preconditions the Wasserstein gradient with a regularized square root of the squared Wasserstein Hessian. The preconditioner preserves attraction along positive-curvature directions and induces repulsion along negative-curvature ones, so the iterates escape saddle points instead of being drawn to them as under standard Wasserstein Newton dynamics. Under regularity and benign landscape assumptions, WSFN reaches a neighborhood of a global minimizer in polynomial time, with improved dependence on saddle parameters compared with existing perturbed first-order methods, and then converges linearly to a non-degenerate global minimizer. The paper also establishes second-order sufficient optimality conditions for strict local minimality and presents a particle-based implementation.

📝 Abstract

We study the minimization of non-convex functionals over the Wasserstein space. While recent work has shown that perturbed Wasserstein gradient methods can avoid saddle points for benign landscapes, existing approaches remain essentially first-order and do not provide fast local convergence once the iterates enter a neighborhood of a global minimizer. We propose Wasserstein Saddle-Free Newton (WSFN), a second-order method that preconditions the Wasserstein gradient by a regularized square root of the squared Wasserstein Hessian. This construction preserves attraction toward directions of positive curvature while inducing repulsion along directions of negative curvature, thereby overcoming the tendency of standard Wasserstein Newton dynamics to be attracted to saddles. We also establish second-order sufficient optimality conditions on Wasserstein space for strict local minimality. Under regularity and benign landscape assumptions, we prove that WSFN escapes saddle regions and reaches an $\alpha$-neighborhood of a global minimizer in polynomial time, with improved dependence on saddle parameters compared with prior perturbed first-order methods. Once inside this neighborhood, we show that WSFN converges linearly in $L^2$-Wasserstein distance to a non-degenerate global minimizer. Finally, we present a particle-based implementation of the method.
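The preconditioning idea in the abstract can be illustrated in ordinary Euclidean space. The sketch below is a finite-dimensional analogue only, not the paper's Wasserstein method or its particle implementation. It assumes the regularized square root takes the form $(H^2 + \varepsilon I)^{1/2}$, and the test function, the values of `eps` and `lr`, and all function names are hypothetical choices made for illustration. The test function has a saddle at $(0, 0)$ and global minima at $(0, \pm 1)$.

```python
import numpy as np

# Toy objective (an assumption for illustration):
# f(x, y) = x^2/2 + y^4/4 - y^2/2, saddle at (0, 0), minima at (0, +-1).
def f(p):
    x, y = p
    return 0.5 * x**2 + 0.25 * y**4 - 0.5 * y**2

def grad(p):
    x, y = p
    return np.array([x, y**3 - y])

def hess(p):
    x, y = p
    return np.array([[1.0, 0.0], [0.0, 3.0 * y**2 - 1.0]])

def newton_step(p):
    # Plain Newton: p - H^{-1} g. Every critical point attracts, saddles included.
    return p - np.linalg.solve(hess(p), grad(p))

def saddle_free_step(p, eps=0.25, lr=0.5):
    # Precondition with (H^2 + eps*I)^{-1/2}, computed from the eigendecomposition
    # of the symmetric Hessian. Each eigenvalue w is replaced by sqrt(w^2 + eps),
    # which is positive, so the step descends along every eigen-direction and
    # moves away from the saddle along negative-curvature directions.
    w, V = np.linalg.eigh(hess(p))
    inv_abs = 1.0 / np.sqrt(w**2 + eps)
    return p - lr * (V @ (inv_abs * (V.T @ grad(p))))

def run(step, p0, n_steps=100):
    p = np.array(p0, dtype=float)
    for _ in range(n_steps):
        p = step(p)
    return p

start = [1.0, 0.1]  # close to the saddle along the y-axis
p_newton = run(newton_step, start)
p_sfn = run(saddle_free_step, start)
print("Newton:     ", p_newton, "f =", f(p_newton))
print("Saddle-free:", p_sfn, "f =", f(p_sfn))
```

From this starting point, plain Newton is drawn into the saddle at the origin, while the preconditioned iteration moves away along the negative-curvature direction and settles at the minimizer $(0, 1)$. The relatively large `eps` keeps the step bounded where the Hessian eigenvalue crosses zero. How the paper selects its regularization is not stated in the abstract.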

Problem

Research questions and friction points this paper is trying to address.

non-convex optimization

Wasserstein space

saddle points

global minima

second-order methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Wasserstein space

second-order optimization

saddle-point avoidance