🤖 AI Summary
Bregman proximal-type algorithms (e.g., mirror descent) suffer from a fundamental flaw when the objective's gradient is not Lipschitz continuous: even for convex problems, they may converge to "spurious stationary points" (points that satisfy standard Bregman stationarity criteria yet are not true stationary points) and cannot escape such points in finitely many iterations. Method: The phenomenon stems from an intrinsic discrepancy between Bregman and Euclidean geometries in how descent is measured, which renders conventional Bregman-divergence-based stationarity measures unreliable and lets them emit misleading convergence signals. Contribution/Results: We provide the first rigorous proof that every existing Bregman stationarity measure necessarily admits spurious stationary points; establish an algorithm-independent hardness result showing that no Bregman proximal-type method can escape such a point in finitely many steps from an unfavorable initialization, exposing an inherent limitation of Bregman geometry; and thereby delineate new theoretical boundaries for the reliability of Bregman methods in convex optimization. Our findings call for the development of geometry-aware convergence certification mechanisms.
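For concreteness, here is the textbook form of the Bregman proximal (mirror descent) update with kernel h and step size η, together with the kind of Bregman-divergence-based stationarity surrogate the summary refers to. This is a standard formulation for context, not necessarily the exact measures the paper analyzes.

```latex
% Bregman proximal (mirror descent) update with Legendre kernel h and step size \eta:
x_{k+1} \in \operatorname*{arg\,min}_{x \in \overline{\operatorname{dom}}\, h}
  \Big\{ \langle \nabla f(x_k), x \rangle + \tfrac{1}{\eta}\, D_h(x, x_k) \Big\},
\qquad
D_h(x, y) := h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle .

% A typical Bregman stationarity measure declares x_k \varepsilon-stationary when
\tfrac{1}{\eta}\, D_h(x_{k+1}, x_k) \le \varepsilon
\quad\text{or}\quad
\tfrac{1}{\eta}\, D_h(x_k, x_{k+1}) \le \varepsilon .
```

The trouble described above arises when ∇h blows up on the boundary of dom h: the divergence D_h between successive iterates can vanish at boundary points that are not stationary in the Euclidean sense.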
📝 Abstract
Despite the considerable success of Bregman proximal-type algorithms, such as mirror descent, in machine learning, a critical question remains: Can existing stationarity measures, often based on Bregman divergence, reliably distinguish between stationary and non-stationary points? In this paper, we present a groundbreaking finding: all existing stationarity measures necessarily imply the existence of spurious stationary points. We further establish an algorithm-independent hardness result: Bregman proximal-type algorithms are unable to escape a spurious stationary point within any finite number of steps when the initial point is unfavorable, even for convex problems. Our hardness result highlights the inherent distinction between Euclidean and Bregman geometries and poses fundamental theoretical and numerical challenges to both the machine learning and optimization communities.
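The following is a minimal, self-contained sketch of the flavor of this phenomenon; the construction and all names (`mirror_descent_step`, `proj_simplex`, etc.) are mine, not the paper's. It runs entropic mirror descent on the probability simplex with a linear (hence convex) objective, started at a vertex whose "good" coordinate is exactly zero. The multiplicative update can never revive a zero coordinate, so the iterates stay pinned at a non-optimal vertex: the Bregman (KL-based) stationarity measure reads zero while the Euclidean gradient mapping does not.

```python
import numpy as np

def kl(x, y):
    """Bregman divergence of the Shannon entropy kernel h(x) = sum_i x_i log x_i,
    i.e., D_h(x, y) = sum_i x_i log(x_i / y_i), with the convention 0 log 0 = 0."""
    m = x > 0
    return float(np.sum(x[m] * np.log(x[m] / y[m])))

def mirror_descent_step(x, grad, eta):
    """Entropic mirror descent on the simplex (multiplicative weights update).
    Note: coordinates that are exactly 0 stay 0 forever."""
    w = x * np.exp(-eta * grad)
    return w / w.sum()

def proj_simplex(z):
    """Euclidean projection onto the probability simplex (sort-based algorithm)."""
    u = np.sort(z)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, z.size + 1)
    rho = k[u - (css - 1.0) / k > 0][-1]
    tau = (css[rho - 1] - 1.0) / rho
    return np.maximum(z - tau, 0.0)

# Convex problem: minimize f(x) = <c, x> over the simplex; the minimizer is e_1 = (1, 0).
c = np.array([0.0, 1.0])
eta = 0.1
x = np.array([0.0, 1.0])  # "unfavorable" start: the true minimizer's coordinate is 0

for k in range(3):
    x_next = mirror_descent_step(x, c, eta)
    bregman_measure = kl(x_next, x) / eta                 # Bregman stationarity surrogate
    grad_map = (x - proj_simplex(x - eta * c)) / eta      # Euclidean gradient mapping
    print(f"iter {k}: x = {x_next}, Bregman measure = {bregman_measure:.3g}, "
          f"||Euclidean grad map|| = {np.linalg.norm(grad_map):.3g}")
    x = x_next
# The Bregman measure is 0 (the point looks stationary), yet the Euclidean gradient
# mapping norm stays near 0.71: x = (0, 1) behaves as a spurious stationary point.
```

Because the mirror map's gradient ∇h(x) = 1 + log x diverges at the boundary, no finite number of multiplicative steps can move the iterate off the zero coordinate, which is a toy analogue of the finite-step escape hardness claimed in the abstract.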