🤖 AI Summary
Existing universal online learning methods achieve minimax-optimal regret bounds but lack problem-dependent adaptivity to the gradient variation $V_T$, which hinders fast convergence in stochastic optimization and game-theoretic settings. This paper proposes UniGrad, the first gradient-variation-adaptive framework for universal online convex optimization: it requires no prior knowledge of curvature and handles strongly convex, exponentially concave, and general convex functions in a unified way. Built on a meta-algorithmic architecture with multiple base learners, realized via correction mappings and Bregman iterations, UniGrad is extended to UniGrad++, which uses only a single gradient query per round. For strongly convex and exponentially concave losses, UniGrad++ attains $\mathcal{O}(\log V_T)$ and $\mathcal{O}(d \log V_T)$ regret, respectively; for general convex losses, it achieves $\mathcal{O}(\sqrt{V_T \log V_T})$ regret, further improvable to the optimal $\mathcal{O}(\sqrt{V_T})$. The framework thus combines theoretical optimality with practical efficiency.
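The central quantity here is the gradient variation, commonly defined in this literature as $V_T = \sum_{t=2}^{T} \sup_{x \in \mathcal{X}} \|\nabla f_t(x) - \nabla f_{t-1}(x)\|^2$ (this definition is standard but not spelled out in the summary above). A minimal sketch of how one would compute it, approximating the supremum over a finite grid (the function and variable names below are illustrative, not from the paper):

```python
import numpy as np

def gradient_variation(grad_fns, xs_grid):
    """Approximate V_T = sum_{t>=2} sup_x ||grad f_t(x) - grad f_{t-1}(x)||^2,
    with the supremum taken over a finite grid of candidate points."""
    V = 0.0
    for g_prev, g_curr in zip(grad_fns[:-1], grad_fns[1:]):
        V += max(np.linalg.norm(g_curr(x) - g_prev(x)) ** 2 for x in xs_grid)
    return V

# Linear losses f_t(x) = <g_t, x> have constant gradients, so the sup is exact:
gs = [np.array([1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
grad_fns = [lambda x, g=g: g for g in gs]
V = gradient_variation(grad_fns, xs_grid=[np.zeros(2)])
# identical consecutive losses contribute 0; the change from g_2 to g_3 contributes 2
```

When the environment changes slowly (e.g. gradually drifting losses, or stochastic losses with a fixed mean), $V_T$ can be far smaller than $T$, which is why $V_T$-dependent bounds can beat the minimax $T$-dependent rates.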
📝 Abstract
Universal online learning aims to achieve optimal regret guarantees without requiring prior knowledge of the curvature of online functions. Existing methods have established minimax-optimal regret bounds for universal online learning, where a single algorithm can simultaneously attain $\mathcal{O}(\sqrt{T})$ regret for convex functions, $\mathcal{O}(d \log T)$ for exp-concave functions, and $\mathcal{O}(\log T)$ for strongly convex functions, where $T$ is the number of rounds and $d$ is the dimension of the feasible domain. However, these methods still lack problem-dependent adaptivity. In particular, no universal method provides regret bounds that scale with the gradient variation $V_T$, a key quantity that plays a crucial role in applications such as stochastic optimization and fast-rate convergence in games. In this work, we introduce UniGrad, a novel approach that achieves both universality and adaptivity, with two distinct realizations: UniGrad.Correct and UniGrad.Bregman. Both methods achieve universal regret guarantees that adapt to the gradient variation, simultaneously attaining $\mathcal{O}(\log V_T)$ regret for strongly convex functions and $\mathcal{O}(d \log V_T)$ regret for exp-concave functions. For convex functions, the regret bounds differ: UniGrad.Correct achieves an $\mathcal{O}(\sqrt{V_T \log V_T})$ bound while preserving the RVU property that is crucial for fast convergence in online games, whereas UniGrad.Bregman achieves the optimal $\mathcal{O}(\sqrt{V_T})$ regret bound through a novel design. Both methods employ a meta-algorithm with $\mathcal{O}(\log T)$ base learners, which naturally requires $\mathcal{O}(\log T)$ gradient queries per round. To enhance computational efficiency, we introduce UniGrad++, which retains the regret guarantees while reducing the number of gradient queries to just one per round via surrogate optimization. We further discuss various implications of our results.
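The two-layer architecture described above (a meta-algorithm aggregating $\mathcal{O}(\log T)$ base learners, with all learners sharing a single gradient query per round) can be illustrated with a toy sketch. Everything below — the Hedge meta-learner over linearized losses, online gradient descent base learners on a geometric step-size grid, and the function names — is an illustrative assumption in the spirit of generic two-layer universal methods, not the paper's actual UniGrad/UniGrad++ construction, which additionally involves correction mappings and Bregman iterations:

```python
import numpy as np

def project(x, radius=1.0):
    # Euclidean projection onto the ball of the given radius
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def two_layer_universal(grad, T, d, radius=1.0, G=1.0):
    """Toy two-layer scheme: O(log T) OGD base learners with geometrically
    spaced step sizes, combined by a Hedge meta-algorithm run on linearized
    losses <g_t, x_i>. All learners reuse one gradient query per round."""
    K = max(1, int(np.ceil(np.log2(T))))             # O(log T) base learners
    etas = [radius / (G * 2 ** k) for k in range(K)]  # geometric step-size grid
    xs = [np.zeros(d) for _ in range(K)]              # base-learner iterates
    w = np.ones(K) / K                                # Hedge weights
    eps = np.sqrt(np.log(K) / T)                      # Hedge learning rate
    played = []
    for _ in range(T):
        x = sum(w_i * x_i for w_i, x_i in zip(w, xs))  # meta decision
        played.append(x)
        g = grad(x)                     # the single gradient query this round
        # Hedge update on the linearized losses <g, x_i> (min-shift for stability)
        losses = np.array([g @ x_i for x_i in xs])
        w = w * np.exp(-eps * (losses - losses.min()))
        w /= w.sum()
        # each base learner takes an OGD step with the shared gradient
        xs = [project(x_i - eta * g, radius) for x_i, eta in zip(xs, etas)]
    return played

# toy usage: fixed quadratic loss f(x) = ||x - c||^2 / 2, so grad(x) = x - c
c = np.array([0.5, -0.3])
played = two_layer_universal(lambda x: x - c, T=300, d=2)
```

The geometric grid is what keeps the number of base learners at $\mathcal{O}(\log T)$: some step size in the grid is within a factor of two of any curvature-appropriate choice, and the meta-learner's weights concentrate on it. UniGrad++'s surrogate-optimization idea plays the analogous role of letting all base learners share one gradient evaluation instead of querying $\mathcal{O}(\log T)$ gradients per round.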