🤖 AI Summary
Wasserstein gradient flows (WGFs) for nonconvex functional optimization can converge to saddle points, failing to satisfy second-order optimality conditions.
Method: We propose a perturbed WGF algorithm that injects structured Gaussian-process noise near saddle points, perturbing the measure along the eigendirection associated with the smallest eigenvalue of the Wasserstein Hessian to enable saddle-point escape.
Contribution/Results: We establish, for the first time, rigorous convergence of this perturbed flow to second-order stationary points in the Wasserstein space, with an explicit upper bound on computational complexity. Under benign structural assumptions on the objective functional, the algorithm achieves polynomial-time global convergence. Our key innovation lies in coupling Hessian geometric information with stochastic vector field dynamics—yielding the first theoretically grounded, practical framework for second-order optimization over the space of probability measures.
📝 Abstract
Wasserstein gradient flow (WGF) is a common method for performing optimization over the space of probability measures. While WGF is guaranteed to converge to a first-order stationary point, for nonconvex functionals the converged solution does not necessarily satisfy the second-order optimality condition; i.e., it could converge to a saddle point. In this work, we propose a new algorithm for probability measure optimization, perturbed Wasserstein gradient flow (PWGF), that achieves second-order optimality for general nonconvex objectives. PWGF enhances WGF by injecting noisy perturbations near saddle points via a Gaussian-process-based scheme. By pushing the measure forward along a random vector field generated from a Gaussian process, PWGF helps the solution escape saddle points efficiently by perturbing it along the eigendirection associated with the smallest eigenvalue of the Wasserstein Hessian. We theoretically derive the computational complexity for PWGF to achieve a second-order stationary point. Furthermore, we prove that PWGF converges to a global optimum in polynomial time for strictly benign objectives.
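As a rough illustration of the mechanism (not the paper's actual implementation), the idea can be sketched with a particle approximation of the measure: for a linear potential functional, WGF reduces to gradient descent on the particles, and a random smooth vector field stands in for the Gaussian-process perturbation. The toy potential `V`, the finite RBF-feature construction of the random field, and all step sizes and thresholds below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy potential with a saddle at the origin and minima at (+-1, 0):
# V(x, y) = (x^2 - 1)^2 + y^2. For the linear functional F(mu) = E_mu[V],
# the Wasserstein gradient flow moves each particle by -grad V.
def grad_V(X):
    x, y = X[:, 0], X[:, 1]
    return np.stack([4 * x * (x**2 - 1), 2 * y], axis=1)

def gp_vector_field(X, n_features=32, h=0.5, scale=1.0):
    """Draw a random smooth vector field via random RBF features -- a
    finite-feature stand-in for a Gaussian-process sample (an assumption;
    the paper's covariance structure may differ)."""
    centers = rng.uniform(-2, 2, size=(n_features, 2))
    weights = rng.normal(0, scale / np.sqrt(n_features), size=(n_features, 2))
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    phi = np.exp(-d2 / (2 * h**2))  # phi[i, k] = k(X_i, c_k)
    return phi @ weights

def pwgf(X, steps=500, lr=0.05, tol=1e-3, perturb=True):
    X = X.copy()
    for _ in range(steps):
        g = grad_V(X)
        if perturb and np.linalg.norm(g, axis=1).mean() < tol:
            # Near a first-order stationary point the gradient field vanishes;
            # push the measure forward along a random GP vector field instead.
            # (A full implementation would also check the smallest eigenvalue
            # of the Wasserstein Hessian to avoid perturbing at minima.)
            X = X + lr * gp_vector_field(X)
        else:
            X = X - lr * g
    return X

X0 = np.zeros((200, 2))                 # all mass at the saddle point
X_plain = pwgf(X0, perturb=False)       # plain WGF: stuck at the saddle
X_pert = pwgf(X0, perturb=True)         # PWGF-style: escapes toward a minimum
```

Plain WGF started exactly at the saddle never moves (the gradient is zero there), while the perturbed flow drifts off the unstable direction and then follows the gradient to a minimum, mirroring the escape behavior the abstract describes.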