🤖 AI Summary
This paper studies online learning and offline (strongly) convex optimization under Hölder smoothness when the smoothness parameters are unknown. The authors design a fully adaptive gradient-variation online learning algorithm: a detection-and-adjustment mechanism based on the observed gradient variation removes the need for prior knowledge of the Hölder exponent or smoothness constant, and the method handles smooth and nonsmooth convex functions uniformly. Theoretically, the algorithm attains the optimal regret guarantees in both the smooth and nonsmooth regimes, and through online-to-batch conversion it yields an optimal universal method for stochastic convex optimization. For offline strongly convex optimization, combining this online adaptivity with a guess-and-verify procedure gives, for the first time, accelerated $O(1/k^2)$ convergence without assuming known smoothness, while retaining near-optimal $O(1/k)$ convergence in the nonsmooth case. The resulting methods thus offer both universality across function classes and robustness to unknown problem parameters.
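As background, the Hölder smoothness condition referenced above is the standard one interpolating between gradient-Lipschitz and merely Lipschitz functions. The sketch below states it together with the gradient-variation quantity commonly used in this line of work; the symbols $L$, $\nu$, $V_T$, and $\mathcal{X}$ are generic notation, and the paper's own definitions and exact bounds may differ.

```latex
% Hölder (weakly) smooth functions: for some L > 0 and exponent \nu \in [0,1],
% the gradients are \nu-Hölder continuous. \nu = 1 recovers L-smoothness;
% \nu = 0 covers nonsmooth Lipschitz functions (bounded (sub)gradient variation).
% Both L and \nu are unknown to the algorithm.
\[
  \|\nabla f(x) - \nabla f(y)\| \;\le\; L\,\|x - y\|^{\nu},
  \qquad \forall\, x, y, \quad \nu \in [0,1].
\]
% Gradient variation of an online sequence f_1, \dots, f_T over the feasible
% set \mathcal{X} (standard definition in gradient-variation online learning):
\[
  V_T \;=\; \sum_{t=2}^{T} \sup_{x \in \mathcal{X}}
            \big\| \nabla f_t(x) - \nabla f_{t-1}(x) \big\|^2 .
\]
```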
📝 Abstract
Smoothness is known to be crucial for acceleration in offline optimization, and for gradient-variation regret minimization in online learning. Interestingly, these two problems are actually closely connected -- accelerated optimization can be understood through the lens of gradient-variation online learning. In this paper, we investigate online learning with Hölder smooth functions, a general class encompassing both smooth and non-smooth (Lipschitz) functions, and explore its implications for offline optimization. For (strongly) convex online functions, we design corresponding gradient-variation online learning algorithms whose regret smoothly interpolates between the optimal guarantees in the smooth and non-smooth regimes. Notably, our algorithms do not require prior knowledge of the Hölder smoothness parameter, exhibiting strong adaptivity over existing methods. Through online-to-batch conversion, this gradient-variation online adaptivity yields an optimal universal method for stochastic convex optimization under Hölder smoothness. However, achieving universality in offline strongly convex optimization is more challenging. We address this by integrating online adaptivity with a detection-based guess-and-check procedure, which, for the first time, yields a universal offline method that achieves accelerated convergence in the smooth regime while maintaining near-optimal convergence in the non-smooth one.
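To make the guess-and-check idea concrete, here is a minimal, self-contained sketch of a generic doubling procedure for an unknown smoothness constant: run a base method with the current guess, verify a certificate along the way, and double the guess when the certificate fails. The base method here is vanilla gradient descent and the certificate is the plain descent-lemma inequality; the paper's actual procedure instead detects failures through gradient variation inside an online algorithm, so `guess_and_check` and its parameters below are illustrative stand-ins rather than the authors' method.

```python
import numpy as np

def guess_and_check(f, grad, x0, inner_steps=100, L_guess=1.0, max_doublings=30):
    """Gradient descent with step size 1/L_guess, restarted with a doubled
    guess whenever the smoothness certificate (descent lemma) fails."""
    for _ in range(max_doublings):
        x = x0.copy()
        certified = True
        for _ in range(inner_steps):
            g = grad(x)
            x_next = x - g / L_guess
            # Descent-lemma check: holds at every step if f is L_guess-smooth;
            # a violation signals that the current guess is too small.
            gap = f(x_next) - (f(x) + g @ (x_next - x)
                               + 0.5 * L_guess * np.sum((x_next - x) ** 2))
            if gap > 1e-12:
                certified = False
                break
            x = x_next
        if certified:
            return x, L_guess
        L_guess *= 2.0  # guess too small: double and restart
    return x, L_guess

# Example: an ill-conditioned quadratic whose true smoothness constant (10)
# is never supplied to the procedure.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x_hat, L_final = guess_and_check(f, grad, x0=np.array([5.0, -3.0]))
print(x_hat, L_final)  # x_hat near the minimizer 0, L_final = 16.0
```

The doubling loop only ever overshoots the true constant by a factor of two, which is why such schemes can preserve the base method's rate up to constants; the paper's contribution is achieving this adaptivity with gradient-variation detection in the online setting, rather than with the offline descent test sketched here.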