Mirror Descent Under Generalized Smoothness

📅 2025-02-02
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing smoothness definitions and convergence guarantees for nonsmooth machine learning objectives are inadequate in non-Euclidean spaces, where classical Euclidean norms fail to capture intrinsic geometric structure. Method: We introduce ℓ*-smoothness, a novel smoothness notion defined with respect to arbitrary norm pairs (not only the Euclidean norm), to characterize the local curvature of objective functions, and propose a generalized self-bounding property. Contribution/Results: Building on this framework, we establish the first universal convergence theory for mirror-descent-type algorithms under both deterministic and stochastic settings: (i) under ℓ*-smoothness, deterministic mirror descent achieves optimal convergence rates matching those of classical smooth optimization; (ii) under a bounded noise condition, stochastic mirror descent attains anytime convergence guarantees. Our work unifies and extends nonsmooth optimization theory by generalizing smoothness to arbitrary convex geometries, thereby providing a rigorous foundation for structured learning problems in non-Euclidean spaces.
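The summary does not spell the condition out, but generalized-smoothness assumptions of this kind are typically stated as a bound on the Hessian's operator norm, measured from a general norm $\|\cdot\|$ to its dual $\|\cdot\|_*$, by a nondecreasing function $\ell$ of the dual gradient norm. The sketch below follows that standard template and is an illustration rather than the paper's exact definition:

$$\big\|\nabla^2 f(x)\big\|_{\|\cdot\| \to \|\cdot\|_*} \;\le\; \ell\big(\|\nabla f(x)\|_*\big).$$

Under this template, the Euclidean norm with constant $\ell(u) \equiv L$ recovers classical $L$-smoothness, while $\ell(u) = L_0 + L_1 u$ recovers the $(L_0, L_1)$-smoothness condition studied in the earlier Euclidean-only line of work.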

Technology Category

Application Category

πŸ“ Abstract
Smoothness is crucial for attaining fast rates in first-order optimization. However, many optimization problems in modern machine learning involve non-smooth objectives. Recent studies relax the smoothness assumption by allowing the Lipschitz constant of the gradient to grow with respect to the gradient norm, which accommodates a broad range of objectives in practice. Despite this progress, existing generalizations of smoothness are restricted to Euclidean geometry with the $\ell_2$-norm and only have theoretical guarantees for optimization in the Euclidean space. In this paper, we address this limitation by introducing a new $\ell_*$-smoothness concept that measures the norm of the Hessian in terms of a general norm and its dual, and establish convergence for mirror-descent-type algorithms, matching the rates under classic smoothness. Notably, we propose a generalized self-bounding property that facilitates bounding the gradients via controlling suboptimality gaps, serving as a principal component for the convergence analysis. Beyond deterministic optimization, we establish an anytime convergence guarantee for stochastic mirror descent based on a new bounded noise condition that encompasses the widely adopted bounded or affine noise assumptions.
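As a concrete illustration of the algorithm family analyzed here, the following is a minimal sketch of mirror descent with the entropic mirror map (exponentiated gradient) on the probability simplex. The function name, step size, and toy objective are illustrative choices, not the paper's setup, and the same loop covers the stochastic variant whenever `grad` returns a noisy gradient estimate.

```python
import numpy as np

def entropic_mirror_descent(grad, x0, steps, eta):
    """Mirror descent on the probability simplex with the entropic mirror map
    (exponentiated gradient). `grad(x)` may return an exact gradient or a
    stochastic estimate, which gives the stochastic mirror descent variant."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        # Mirror step for the negative-entropy mirror map:
        # x_{t+1} is proportional to x_t * exp(-eta * g_t), renormalized to the simplex.
        x = x * np.exp(-eta * g)
        x /= x.sum()
    return x

# Toy usage (illustrative only): minimize f(x) = 0.5 * ||A x - b||^2 over the simplex.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad_f = lambda x: A.T @ (A @ x - b)
x_hat = entropic_mirror_descent(grad_f, np.full(5, 0.2), steps=500, eta=0.05)
```

The entropic mirror map is the standard choice when the constraint set is the simplex and the geometry is measured in the $\ell_1$/$\ell_\infty$ norm pair; other mirror maps (e.g., squared Euclidean, which recovers projected gradient descent) fit the same update template.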
Problem

Research questions and friction points this paper is trying to address.

Non-smooth Optimization
Machine Learning
Convergence Guarantee
Innovation

Methods, ideas, or system contributions that make the work stand out.

ℓ*-smoothness
generalized self-bounding property
anytime convergence for stochastic mirror descent under bounded noise
🔎 Similar Papers
No similar papers found.
Dingzhi Yu
Nanjing University
Machine Learning, Stochastic Optimization, Online Learning
Wei Jiang
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; School of Artificial Intelligence, Nanjing University, Nanjing 210023, China
Yuanyu Wan
Zhejiang University
Machine Learning, Online Learning, Distributed Optimization
Lijun Zhang
National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; School of Artificial Intelligence, Nanjing University, Nanjing 210023, China