Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

📅 2023-02-07
🏛️ International Conference on Machine Learning
📈 Citations: 35
Influential: 9
🤖 AI Summary
This paper studies the stochastic gradient oracle complexity of finding (δ,ε)-stationary points in non-smooth non-convex optimization. To break the prior best-known complexity of O(ε⁻⁴δ⁻¹), it establishes a reduction from non-smooth non-convex optimization to online learning, after which convergence guarantees follow from standard regret bounds. The contributions are: (1) an algorithm achieving the optimal stochastic gradient query complexity O(ε⁻³δ⁻¹); (2) a matching lower bound certifying that this rate cannot be improved; (3) an extension to deterministic, second-order smooth objectives using optimistic online learning, yielding a new complexity of O(ε⁻¹·⁵δ⁻⁰·⁵); and (4) recovery of all optimal or best-known rates for finding ε-stationary points of smooth and second-order smooth objectives in both stochastic and deterministic settings, unifying these results under a single framework.
📝 Abstract
We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our techniques also recover all optimal or best-known results for finding $\epsilon$-stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.
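
For reference (the abstract does not restate it), the stationarity notion used throughout this line of work is Goldstein's: a point $x$ is $(\delta,\epsilon)$-stationary for $f$ if some convex combination of subgradients taken within distance $\delta$ of $x$ has norm at most $\epsilon$. A minimal statement, assuming the standard definition from this literature:

```latex
% Goldstein (delta, epsilon)-stationarity: x is (delta, epsilon)-stationary
% for f if the Goldstein delta-subdifferential contains a vector of norm <= epsilon.
\[
  \min\left\{ \|g\| \,:\, g \in \operatorname{conv}\Big( \bigcup_{\|y - x\| \le \delta} \partial f(y) \Big) \right\} \le \epsilon .
\]
```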
Problem

Research questions and friction points this paper is trying to address.

Optimizing non-smooth, non-convex stochastic objectives efficiently
Reducing the gradient query complexity of finding (δ,ε)-stationary points
Extending the techniques to deterministic and second-order smooth objectives
Innovation

Methods, ideas, or system contributions that make the work stand out.

A reduction from non-smooth non-convex optimization to online learning
An online-to-non-convex conversion that improves the stochastic gradient query complexity to the optimal O(ε⁻³δ⁻¹) (see the sketch after this list)
Optimistic online learning techniques for deterministic, second-order smooth objectives
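
To make the conversion concrete, below is a minimal sketch of the online-to-non-convex idea as described in the abstract: an online learner proposes increments Δₜ, the iterate moves by Δₜ, and the learner is charged the linear loss ⟨gₜ, Δₜ⟩, where gₜ is a stochastic gradient taken at a random point between consecutive iterates. The learner choice (projected online gradient descent), the parameters `eta` and `D`, and the output handling are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def online_to_nonconvex(grad_oracle, x0, T=1000, D=0.1, eta=0.01, seed=0):
    """Sketch of an online-to-non-convex conversion.

    An online learner (here: online gradient descent projected onto the
    ball of radius D) proposes increments delta_t. The iterate moves by
    delta_t, and the learner is charged the linear loss <g_t, delta_t>,
    where g_t is a stochastic gradient evaluated at a uniformly random
    point on the segment between consecutive iterates.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    delta = np.zeros_like(x)  # the learner's first play
    trajectory = [x.copy()]
    for _ in range(T):
        # Stochastic gradient at a random point between x and x + delta.
        s = rng.uniform()
        g = grad_oracle(x + s * delta)
        x = x + delta
        trajectory.append(x.copy())
        # Online gradient descent step on the linear loss <g, .>,
        # projected back onto the ball of radius D.
        delta = delta - eta * g
        norm = np.linalg.norm(delta)
        if norm > D:
            delta *= D / norm
    # The paper extracts a candidate (delta, epsilon)-stationary point by
    # examining windows of the trajectory; here we return it whole.
    return trajectory
```

Intuitively, low regret for the learner forces the windowed averages of the gₜ to be small in norm, while each window of iterates stays inside a δ-ball (since ‖Δₜ‖ ≤ D), which is what certifies (δ,ε)-stationarity of a point in that window.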