🤖 AI Summary
This work addresses the challenge of unbounded metric complexity in online convex optimization with noisy feedback, where the objective involves jointly minimizing high-dimensional dynamic quadratic hitting costs and ℓ₂-norm switching costs. To tackle this problem in stochastic environments with unknown hitting cost structures, the authors propose the SCaLE algorithm, which achieves the first sublinear dynamic regret guarantee for this joint setting. By introducing a spectral regret analysis, SCaLE disentangles the regret contributions arising from eigenvalue estimation errors and from perturbations of the eigenvector basis. The theoretical analysis establishes a distribution-free sublinear dynamic regret bound for SCaLE, and empirical evaluations demonstrate superior performance over multiple baselines while confirming the algorithm's statistical consistency.
📝 Abstract
This work addresses the fundamental problem of unbounded metric movement costs in bandit online convex optimization by considering high-dimensional dynamic quadratic hitting costs and $\ell_2$-norm switching costs in a noisy bandit feedback model. For a general class of stochastic environments, we provide SCaLE, the first algorithm that provably achieves distribution-agnostic sub-linear dynamic regret without knowledge of the hitting cost structure. En route, we present a novel spectral regret analysis that separately quantifies eigenvalue-error-driven regret and eigenbasis-perturbation-driven regret. Extensive numerical experiments against online-learning baselines corroborate our claims and highlight the statistical consistency of our algorithm.
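The abstract does not spell out the per-round objective, but a standard reading of "quadratic hitting costs and $\ell_2$-norm switching costs" is a cost of the form $(x_t - v_t)^\top H_t (x_t - v_t) + \lVert x_t - x_{t-1}\rVert_2$, observed only through noisy scalar feedback. The sketch below is illustrative only: the function names, the PSD construction of $H_t$, the noise scale, and the random "play" are all assumptions, not the paper's SCaLE algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 5  # dimension and horizon (illustrative values)

def hitting_cost(x, H, v):
    """Quadratic hitting cost (x - v)^T H (x - v) for a PSD matrix H."""
    diff = x - v
    return diff @ H @ diff

def round_cost(x, x_prev, H, v):
    """Per-round objective: hitting cost plus l2-norm switching cost."""
    return hitting_cost(x, H, v) + np.linalg.norm(x - x_prev)

x_prev = np.zeros(d)
total = 0.0
for t in range(T):
    # Unknown PSD hitting-cost matrix, built as A A^T from a random A;
    # its eigenvalues/eigenvectors are what a spectral analysis would track.
    A = rng.standard_normal((d, d))
    H, v = A @ A.T, rng.standard_normal(d)
    x = rng.standard_normal(d)  # stand-in for the learner's chosen point
    # Bandit feedback: the learner sees only a noisy scalar cost value.
    noisy_feedback = round_cost(x, x_prev, H, v) + rng.normal(scale=0.1)
    total += round_cost(x, x_prev, H, v)
    x_prev = x
```

Dynamic regret would then compare `total` against the cumulative cost of the best time-varying comparator sequence, which is what the sub-linear bound in the abstract controls.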