🤖 AI Summary
This paper studies the oracle complexity of finding a (δ,ε)-stationary point of a Lipschitz function in nonsmooth optimization: a point at which some convex combination of subgradients, taken within its δ-neighborhood, has norm at most ε. It establishes for the first time that deterministic first-order algorithms necessarily incur dimension-dependent complexity, precluding dimension-free rates; in contrast, randomized algorithms achieve the tight upper bound Õ(1/(δε³)), matched by a lower bound that holds for any randomized algorithm. It further shows that convexity substantially accelerates convergence: for convex functions a deterministic O(1/ε²) upper bound is attained, even though, as the paper proves, no finite-time algorithm can produce points with small subgradients in general, even in the convex case. For smooth functions, the randomized rate can be derandomized with only a logarithmic dependence on the smoothness parameter. The core contribution is a precise characterization of how determinism versus randomness, convexity, and smoothness govern oracle complexity, with tight upper and lower bounds in each setting.
📝 Abstract
We study the oracle complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz functions, in the sense proposed by Zhang et al. [2020]. While there exist dimension-free randomized algorithms for producing such points within $\widetilde{O}(1/\delta\epsilon^3)$ first-order oracle calls, we show that no dimension-free rate can be achieved by a deterministic algorithm. On the other hand, we point out that this rate can be derandomized for smooth functions with merely a logarithmic dependence on the smoothness parameter. Moreover, we establish several lower bounds for this task which hold for any randomized algorithm, with or without convexity. Finally, we show how the convergence rate of finding $(\delta,\epsilon)$-stationary points can be improved in case the function is convex, a setting which we motivate by proving that, in general, no finite time algorithm can produce points with small subgradients even for convex functions.
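For context, the stationarity notion referenced above (following Zhang et al. [2020]) can be stated precisely. The Goldstein $\delta$-subdifferential notation $\partial_\delta f(x)$ used below is standard in this line of work, though it does not appear verbatim in the abstract:

```latex
% A point $x$ is $(\delta,\epsilon)$-stationary for $f$ if the Goldstein
% $\delta$-subdifferential -- the convex hull of all subgradients attained
% in the $\delta$-ball $B_\delta(x)$ around $x$ -- contains an element of
% norm at most $\epsilon$:
\[
\partial_\delta f(x) \;=\; \operatorname{conv}\Bigl(\,\bigcup_{y \in B_\delta(x)} \partial f(y)\Bigr),
\qquad
\min_{g \,\in\, \partial_\delta f(x)} \|g\| \;\le\; \epsilon .
\]
```

As $\delta \to 0$ this recovers the usual notion of an $\epsilon$-stationary point, which is what the abstract's final claim shows cannot in general be produced in finite time, even for convex functions.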