🤖 AI Summary
This study investigates how sensitive stochastic optimization algorithms are to the choice of step size and how their performance degrades when that choice is poor. Through theoretical analysis, it establishes a quantitative relationship between the step size and convergence bounds, providing the first direct theoretical evidence that adaptive step-size methods such as SPS and NGN are more robust than standard SGD, moving beyond prior reliance on purely empirical comparisons. Both the theoretical derivations and the numerical experiments consistently show that the adaptive methods are markedly more stable with respect to step-size selection and degrade more gracefully as the step size grows. This robustness holds in both convex and non-convex settings, with the latter underscoring the continued theoretical relevance of adaptive strategies even in challenging non-convex optimization landscapes.
📝 Abstract
We present a theoretical analysis of stochastic optimization methods in terms of their sensitivity with respect to the step size. We identify a key quantity that, for each method, describes how the performance degrades as the step size becomes too large. For convex problems, we show that this quantity directly impacts the suboptimality bound of the method. Most importantly, our analysis provides direct theoretical evidence that adaptive step-size methods, such as SPS or NGN, are more robust than SGD. This allows us to quantify the advantage of these adaptive methods beyond empirical evaluation. Finally, we show through experiments that our theoretical bound qualitatively mirrors the actual performance as a function of the step size, even for nonconvex problems.
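To make the contrast concrete, here is a minimal numerical sketch of the kind of robustness the abstract describes. It assumes the commonly cited SPS_max-style rule, γ = min(f_i(x) / (c‖∇f_i(x)‖²), γ_max); the function names, the toy quadratic, and the specific constants are illustrative assumptions, not taken from the paper.

```python
def sgd_step(x, grad, lr):
    # Plain SGD: a fixed step size, applied regardless of the gradient's scale.
    return x - lr * grad

def sps_step(x, loss, grad, lr_max, c=0.5, eps=1e-12):
    # Adaptive Polyak-style step (assumed SPS_max form):
    #   gamma = min(f_i(x) / (c * |grad|^2), lr_max)
    # The effective step shrinks automatically where a large fixed step
    # would overshoot; lr_max caps it elsewhere.
    gamma = min(loss / (c * grad * grad + eps), lr_max)
    return x - gamma * grad

# Toy convex problem (illustrative): f(x) = 0.5 * x^2, minimizer x* = 0.
f = lambda x: 0.5 * x * x
grad_f = lambda x: x

x0 = 10.0
lr_too_big = 3.0  # deliberately oversized: SGD with this step diverges on f

x_sgd = sgd_step(x0, grad_f(x0), lr_too_big)         # |x| grows: 10 -> 20
x_sps = sps_step(x0, f(x0), grad_f(x0), lr_too_big)  # adaptive step lands near 0

print(f"SGD: |x| = {abs(x_sgd):.2f}, SPS: |x| = {abs(x_sps):.2e}")
```

With the same nominal step size, the fixed-step iterate moves away from the minimizer while the adaptive step is automatically shortened, mirroring (qualitatively) the sensitivity gap the analysis quantifies.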