🤖 AI Summary
This work addresses the challenge of unbounded metric complexity in online convex optimization with noisy feedback, where the objective involves jointly minimizing high-dimensional dynamic quadratic hitting costs and ℓ₂-norm switching costs. To tackle this problem in stochastic environments with unknown hitting cost structures, the authors propose the SCaLE algorithm, which achieves the first sublinear dynamic regret guarantee for this joint setting. By introducing a spectral regret analysis, SCaLE disentangles the regret contributions arising from eigenvalue estimation errors and from perturbations of the eigenvector basis. The theoretical analysis establishes a distribution-free sublinear dynamic regret bound for SCaLE, and empirical evaluations demonstrate superior performance over multiple baselines while confirming the algorithm's statistical consistency.
📝 Abstract
This work addresses the fundamental problem of unbounded metric movement costs in bandit online convex optimization by considering high-dimensional dynamic quadratic hitting costs and $\ell_2$-norm switching costs in a noisy bandit feedback model. For a general class of stochastic environments, we provide SCaLE, the first algorithm that provably achieves distribution-agnostic sub-linear dynamic regret without knowledge of the hitting cost structure. En route, we present a novel spectral regret analysis that separately quantifies eigenvalue-error-driven regret and eigenbasis-perturbation-driven regret. Extensive numerical experiments against online-learning baselines corroborate our claims and highlight the statistical consistency of our algorithm.
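The abstract does not spell out the per-round objective, but a standard reading of "quadratic hitting costs and $\ell_2$-norm switching costs" is a cost of the form $(x_t - v_t)^\top H_t (x_t - v_t) + \lVert x_t - x_{t-1}\rVert_2$, observed only through noisy scalar feedback. The sketch below is illustrative only: the function names, the PSD construction of $H_t$, the noise scale, and the random "play" are all assumptions, not the paper's SCaLE algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 3, 5  # dimension and horizon (illustrative values)

def hitting_cost(x, H, v):
    """Quadratic hitting cost (x - v)^T H (x - v) for a PSD matrix H."""
    diff = x - v
    return diff @ H @ diff

def round_cost(x, x_prev, H, v):
    """Per-round objective: hitting cost plus l2-norm switching cost."""
    return hitting_cost(x, H, v) + np.linalg.norm(x - x_prev)

x_prev = np.zeros(d)
total = 0.0
for t in range(T):
    # Unknown PSD hitting-cost matrix, built as A A^T from a random A;
    # its eigenvalues/eigenvectors are what a spectral analysis would track.
    A = rng.standard_normal((d, d))
    H, v = A @ A.T, rng.standard_normal(d)
    x = rng.standard_normal(d)  # stand-in for the learner's chosen point
    # Bandit feedback: the learner sees only a noisy scalar cost value.
    noisy_feedback = round_cost(x, x_prev, H, v) + rng.normal(scale=0.1)
    total += round_cost(x, x_prev, H, v)
    x_prev = x
```

Dynamic regret would then compare `total` against the cumulative cost of the best time-varying comparator sequence, which is what the sub-linear bound in the abstract controls.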