On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls

📅 2025-06-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work studies online convex optimization over ℓₚ-balls for p > 2, focusing on whether fixed, non-adaptive regularization enables Follow-The-Regularized-Leader (FTRL) to achieve regret that is optimal uniformly over both the high-dimensional (d > T) and low-dimensional (d ≤ T) regimes. The paper shows that FTRL with time-varying regularization that adapts to the dimension regime is anytime optimal, whereas, for separable regularizers, any fixed regularizer is necessarily sub-optimal in one of the two regimes; adaptivity in the regularizer is therefore necessary. Finally, the paper provides lower bounds ruling out sub-linear regret for the linear bandit problem in sufficiently high dimension, for all ℓₚ-balls with p ≥ 1. Together, these results characterize the necessity and fundamental limits of regularization adaptivity in high-dimensional online learning.

📝 Abstract
We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$), where the dimension $d$ exceeds the time horizon $T$, and the low-dimensional setting ($d \leq T$). We show that Follow-the-Regularised-Leader (FTRL) with time-varying regularisation that is adaptive to the dimension regime is anytime optimal across all dimension regimes. Motivated by this, we ask whether it is possible to obtain anytime optimality of FTRL with fixed, non-adaptive regularisation. Our main result establishes that for separable regularisers, adaptivity in the regulariser is necessary: any fixed regulariser will be sub-optimal in one of the two dimension regimes. Finally, we provide lower bounds which rule out sub-linear regret bounds for the linear bandit problem in sufficiently high dimension for all $\ell_p$-balls with $p \geq 1$.
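To make the setting concrete, the following is a minimal sketch of an FTRL-style learner on the unit $\ell_p$-ball with linear losses. It assumes a plain squared-Euclidean regulariser with an anytime $1/\sqrt{t}$ schedule and uses a simple radial retraction onto the ball in place of the exact constrained argmin; the paper's point is precisely that such a fixed, non-adaptive regulariser cannot be optimal in both dimension regimes, so this is an illustration of the baseline, not of the proposed adaptive scheme.

```python
import numpy as np

def ftrl_lp_ball(grads, p, eta):
    """Illustrative FTRL-style iterates on the unit l_p ball (p > 2).

    Uses a fixed squared-Euclidean regulariser with an anytime
    eta / sqrt(t) step schedule, and a radial rescaling onto the
    l_p ball (a simplification of the exact FTRL argmin).
    """
    d = grads.shape[1]
    G = np.zeros(d)            # running sum of observed linear losses
    iterates = []
    for t, g in enumerate(grads, start=1):
        G += g
        y = -eta * G / np.sqrt(t)          # unconstrained minimiser
        norm = np.linalg.norm(y, ord=p)
        x = y if norm <= 1.0 else y / norm  # retract onto the l_p ball
        iterates.append(x)
    return np.array(iterates)
```

Every iterate stays feasible by construction; replacing the fixed Euclidean regulariser with one that adapts jointly to $d$ and $T$ is the modification the paper argues is necessary for uniform optimality.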
Problem

Research questions and friction points this paper is trying to address.

Optimal online learning on ℓₚ-balls for p > 2
Necessity of adaptive regularization across dimension regimes
Lower bounds for linear bandits in high dimensions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive regularisation for anytime-optimal online learning
Time-varying FTRL adaptive to the dimension regime
Regret lower bounds for high-dimensional linear bandits