🤖 AI Summary
In online exp-concave optimization (OXO), standard algorithms such as Online Newton Step (ONS) achieve the optimal regret bound of $O(d \log T)$ but incur $\Omega(d^\omega)$ operations per iteration (where $\omega \in (2,3]$ is the matrix-multiplication exponent) due to costly Mahalanobis projections, resulting in a total runtime of $\tilde{O}(d^\omega T)$. The stochastic counterpart (SXO) likewise suffers an $\tilde{O}(d^{\omega+1}/\varepsilon)$ runtime, the subject of an open problem raised by Koren (COLT'13). We propose LightONS, the first algorithm to integrate domain transformation and lazy updates from parameter-free learning into exp-concave optimization, deferring expensive projections while enabling adaptive gradient scaling. LightONS retains the optimal $O(d \log T)$ regret while reducing the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$. In the stochastic setting, it achieves $\tilde{O}(d^3/\varepsilon)$ runtime, answering the open problem, and supports extensions including gradient-norm-adaptive regret and stochastic bandits.
📝 Abstract
Online eXp-concave Optimization (OXO) is a fundamental problem in online learning. The standard algorithm, Online Newton Step (ONS), balances statistical optimality and computational practicality, guaranteeing an optimal regret of $O(d \log T)$, where $d$ is the dimension and $T$ is the time horizon. However, ONS faces a computational bottleneck due to the Mahalanobis projections at each round. This step costs $\Omega(d^\omega)$ arithmetic operations for bounded domains, even for the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. For Stochastic eXp-concave Optimization (SXO), computational cost is also a challenge. Deploying ONS with online-to-batch conversion for SXO requires $T = \tilde{O}(d/\varepsilon)$ rounds to achieve an excess risk of $\varepsilon$, and thereby necessitates an $\tilde{O}(d^{\omega+1}/\varepsilon)$ runtime. A COLT'13 open problem posed by Koren [2013] asks for an SXO algorithm with runtime less than $\tilde{O}(d^{\omega+1}/\varepsilon)$.
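To make the projection bottleneck concrete, here is a minimal NumPy sketch of one textbook ONS step on the unit ball (an illustrative reconstruction, not code from the paper; the step-size parameter `gamma` and the bisection-based projection solver are our choices). The Mahalanobis projection $\Pi_K^{A_t}(y) = \arg\min_{x \in K} (x-y)^\top A_t (x-y)$ is the step whose cost scales with matrix decompositions, i.e., $\Omega(d^\omega)$.

```python
import numpy as np

def mahalanobis_proj_ball(y, A, tol=1e-10):
    """Project y onto the unit ball under the norm induced by PSD matrix A:
    argmin_{||x|| <= 1} (x - y)^T A (x - y).
    Uses the KKT condition x = (A + lam*I)^{-1} A y with bisection on lam >= 0."""
    if np.linalg.norm(y) <= 1.0:
        return y  # already feasible
    # Eigendecomposition (the O(d^omega)-type cost) lets us evaluate
    # ||x(lam)|| cheaply inside the bisection.
    evals, Q = np.linalg.eigh(A)
    z = Q.T @ y

    def norm_at(lam):
        return np.linalg.norm(evals * z / (evals + lam))

    lo, hi = 0.0, 1.0
    while norm_at(hi) > 1.0:      # grow the upper bracket until ||x(hi)|| <= 1
        hi *= 2.0
    while hi - lo > tol:          # ||x(lam)|| is monotonically decreasing in lam
        mid = 0.5 * (lo + hi)
        if norm_at(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return Q @ (evals * z / (evals + lam))

def ons_step(x, g, A, gamma):
    """One Online Newton Step update on the unit ball, given gradient g."""
    A = A + np.outer(g, g)                  # rank-one update of the metric
    y = x - np.linalg.solve(A, g) / gamma   # Newton-style descent direction
    x = mahalanobis_proj_ball(y, A)         # the expensive per-round projection
    return x, A
```

When the unprojected iterate repeatedly lands outside the ball, this projection fires every round, which is exactly the regime where the $\tilde{O}(d^\omega T)$ total runtime materializes.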
This paper proposes a simple variant of ONS, LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal $O(d \log T)$ regret. LightONS implies an SXO method with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering the open problem. Importantly, LightONS preserves the elegant structure of ONS by leveraging domain-conversion techniques from parameter-free online learning to introduce a hysteresis mechanism that delays expensive Mahalanobis projections until necessary. This design enables LightONS to serve as an efficient plug-in replacement for ONS in broader scenarios, even beyond regret minimization, including gradient-norm adaptive regret, parametric stochastic bandits, and memory-efficient online learning.