Glocal Smoothness: Line Search can really help!

📅 2025-06-14
📈 Citations: 2
Influential: 0
🤖 AI Summary
Classical iteration complexity analyses for first-order optimization methods rely on a global Lipschitz constant of the gradient, failing to exploit beneficial local smoothness (regions where the Lipschitz constant is small) and thus incurring unnecessary conservatism. Method: The paper introduces "glocal smoothness," a structural assumption that simultaneously captures global and local smoothness properties of the objective function without dependence on algorithmic trajectories, enabling complexity bounds governed solely by intrinsic function constants. Contribution/Results: Under glocal smoothness, gradient descent with backtracking line search attains improved iteration complexity, in some settings surpassing that of accelerated methods with fixed step sizes. The paper further provides a unified, refined convergence analysis for diverse algorithms, including the Polyak step size, adaptive gradient descent (AdGD), coordinate optimization, stochastic gradient methods, accelerated gradient methods, and nonlinear conjugate gradient methods, yielding tighter complexity bounds.

📝 Abstract
Iteration complexities for first-order optimization algorithms are typically stated in terms of a global Lipschitz constant of the gradient, and near-optimal results are achieved using fixed step sizes. But many objective functions that arise in practice have regions with small Lipschitz constants where larger step sizes can be used. Many local Lipschitz assumptions have been proposed, which have led to results showing that adaptive step sizes and/or line searches yield improved convergence rates over fixed step sizes. However, these faster rates tend to depend on the iterates of the algorithm, which makes it difficult to compare the iteration complexities of different methods. We consider a simple characterization of global and local ("glocal") smoothness that only depends on properties of the function. This allows upper bounds on iteration complexities in terms of iterate-independent constants and enables us to compare iteration complexities between algorithms. Under this assumption it is straightforward to show the advantages of line searches over fixed step sizes, and that in some settings, gradient descent with line search has a better iteration complexity than accelerated methods with fixed step sizes. We further show that glocal smoothness can lead to improved complexities for the Polyak and AdGD step sizes, as well as other algorithms including coordinate optimization, stochastic gradient methods, accelerated gradient methods, and non-linear conjugate gradient methods.
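The backtracking line search discussed in the abstract can be sketched as follows. This is an illustrative implementation, not the paper's code: it is standard gradient descent with the Armijo sufficient-decrease condition, and the test quadratic, the trial step `eta0`, the shrink factor `beta`, and the Armijo constant `c` are all choices made here for demonstration. The point it illustrates is the one the abstract makes: in regions where the local Lipschitz constant is small, the accepted step can be much larger than the fixed step 1/L dictated by the global constant L.

```python
import numpy as np

def gradient_descent_backtracking(f, grad_f, x0, eta0=1.0, beta=0.5, c=1e-4,
                                  max_iters=500, tol=1e-8):
    """Gradient descent with Armijo backtracking line search (illustrative sketch).

    At each iteration, start from the trial step eta0 and halve it until the
    Armijo sufficient-decrease condition holds; Armijo guarantees that f
    decreases monotonically across iterations.
    """
    x = x0.astype(float)
    for _ in range(max_iters):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break  # gradient is (numerically) zero: stop
        eta = eta0
        # Armijo condition: f(x - eta*g) <= f(x) - c * eta * ||g||^2
        while f(x - eta * g) > f(x) - c * eta * np.dot(g, g):
            eta *= beta
        x = x - eta * g
    return x

# Example problem: an ill-conditioned quadratic f(x) = 0.5 * x^T A x,
# whose global Lipschitz constant is L = 10 but is locally smoother
# along the first coordinate.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_star = gradient_descent_backtracking(f, grad_f, np.array([1.0, 1.0]))
```

Since the line search only ever accepts steps satisfying the Armijo condition, the objective value decreases at every iteration regardless of how conservative the global Lipschitz constant is.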
Problem

Research questions and friction points this paper is trying to address.

Characterizing global and local smoothness of objective functions
Comparing iteration complexities between different optimization algorithms
Demonstrating advantages of line searches over fixed step sizes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses glocal smoothness for adaptive step sizes
Compares iteration complexities between algorithms
Improves complexities for various optimization methods
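Among the adaptive step sizes whose complexities the paper refines is the classical Polyak step size, eta_k = (f(x_k) - f*) / ||grad f(x_k)||^2. The sketch below is a minimal illustration under the assumption that the optimal value f* is known (here f* = 0 for the test quadratic); it is not the paper's implementation, and the test problem is chosen here for demonstration.

```python
import numpy as np

def polyak_gradient_descent(f, grad_f, x0, f_star=0.0, max_iters=500, tol=1e-12):
    """Gradient descent with the Polyak step size (illustrative sketch).

    Uses eta_k = (f(x_k) - f*) / ||g_k||^2, which adapts to the current
    suboptimality gap without requiring a Lipschitz constant, but assumes
    the optimal value f* is known.
    """
    x = x0.astype(float)
    for _ in range(max_iters):
        g = grad_f(x)
        gnorm2 = np.dot(g, g)
        if gnorm2 < tol:
            break  # gradient is (numerically) zero: stop
        eta = (f(x) - f_star) / gnorm2
        x = x - eta * g
    return x

# Example: the same kind of ill-conditioned quadratic, with f* = 0 at x = 0.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_star = polyak_gradient_descent(f, grad_f, np.array([1.0, 1.0]))
```

The appeal of the Polyak step, like the line searches above, is that it adapts to local curvature automatically; the paper's contribution is bounding its iteration complexity in terms of iterate-independent glocal constants.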
Curtis Fox
Department of Computer Science, University of British Columbia, Vancouver, Canada
Aaron Mishkin
Department of Computer Science, Stanford University, Stanford, USA
Sharan Vaswani
Simon Fraser University
Machine Learning, Optimization, Artificial Intelligence
Mark Schmidt
Professor of Computer Science, University of British Columbia
Machine Learning, Optimization