Online Learning in the Random Order Model

📅 2025-10-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies online learning in the random-order model, where an adversary fixes the sequence of losses upfront and the learner receives it after a random permutation. Over finite horizons such inputs can be markedly non-stationary, which can hinder algorithms designed for stochastic i.i.d. losses. The paper proposes a general template that adapts stochastic online learning algorithms to this setting without substantially affecting their regret guarantees. Applying the template yields improved regret bounds for prediction with delays, online learning with constraints, and bandits with switching costs. For online classification, the paper proves that learnability in random order is characterized by the VC dimension rather than the Littlestone dimension, which further separates the random-order model from the general adversarial one.

📝 Abstract

In the random-order model for online learning, the sequence of losses is chosen upfront by an adversary and presented to the learner after a random permutation. Any random-order input is asymptotically equivalent to a stochastic i.i.d. one, but, for finite times, it may exhibit significant non-stationarity, which can hinder the performance of stochastic learning algorithms. While algorithms for adversarial inputs naturally maintain their regret guarantees in random order, simple no-regret algorithms exist for the stochastic model that fail against random-order instances. In this paper, we propose a general template to adapt stochastic learning algorithms to the random-order model without substantially affecting their regret guarantees. This allows us to recover improved regret bounds for prediction with delays, online learning with constraints, and bandits with switching costs. Finally, we investigate online classification and prove that, in random order, learnability is characterized by the VC dimension rather than the Littlestone dimension, thus providing a further separation from the general adversarial model.
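To make the setting concrete, the following is a minimal Python sketch of a random-order instance and of regret against the best fixed action in hindsight. It assumes a full-information setting with a finite action set, and it uses Follow-the-Leader purely as a stand-in learner; it is not the template proposed in the paper. The function names `random_order`, `regret`, and `follow_the_leader` are hypothetical and chosen for illustration only.

```python
import random


def random_order(losses, seed=None):
    """Return the adversary's fixed loss vectors in a random order.

    `losses` is a list of tuples, one loss per action per round. The multiset
    of loss vectors is unchanged; only their order is permuted.
    """
    rng = random.Random(seed)
    shuffled = list(losses)  # copy, so the caller's list is not mutated
    rng.shuffle(shuffled)
    return shuffled


def regret(played, sequence):
    """Loss of the played actions minus the loss of the best fixed action."""
    n_actions = len(sequence[0])
    incurred = sum(loss[a] for a, loss in zip(played, sequence))
    best_fixed = min(sum(loss[a] for loss in sequence) for a in range(n_actions))
    return incurred - best_fixed


def follow_the_leader(sequence):
    """Illustrative learner: play the action with the smallest past total loss."""
    n_actions = len(sequence[0])
    totals = [0.0] * n_actions
    played = []
    for loss in sequence:
        # Ties are broken in favour of the lowest action index.
        action = min(range(n_actions), key=lambda i: totals[i])
        played.append(action)
        # Full information: the whole loss vector is revealed after playing.
        for i in range(n_actions):
            totals[i] += loss[i]
    return played


if __name__ == "__main__":
    # Hypothetical instance: two actions, five rounds fixed upfront.
    fixed_losses = [(0.0, 1.0)] * 3 + [(1.0, 0.0)] * 2
    instance = random_order(fixed_losses, seed=0)
    print(regret(follow_the_leader(instance), instance))
```

Because the permutation only reorders a fixed multiset of loss vectors, the best-fixed-action benchmark is the same for every ordering. What changes with the ordering is the loss the learner incurs, and since rounds are drawn without replacement, a prefix with unusually high losses for one action forces unusually low losses for that action later. This is one way to see the finite-time non-stationarity the abstract refers to.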

Problem

Research questions and friction points this paper is trying to address.

Adapting stochastic algorithms to handle non-stationarity in random-order online learning

Improving regret bounds for prediction with delays, online learning with constraints, and bandits with switching costs

Establishing VC dimension as the key learnability criterion in random-order classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

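Adapts stochastic learning algorithms to the random-order model through a general template

Preserves their regret guarantees without substantial degradation

Characterizes learnability of online classification in random order via the VC dimension rather than the Littlestone dimension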
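The gap between the two dimensions can be seen on a standard textbook class that is not taken from the paper: threshold classifiers on a line. The sketch below, written under that assumption, computes both quantities by brute force on a finite domain. The function names are hypothetical, and the approach is only practical for very small classes.

```python
from functools import lru_cache
from itertools import combinations


def thresholds(n):
    """All threshold classifiers h_t(x) = 1[x >= t] on the domain {0, ..., n-1}.

    Each hypothesis is stored as the tuple of its labels on the domain.
    """
    return frozenset(tuple(int(x >= t) for x in range(n)) for t in range(n + 1))


def vc_dimension(hypotheses):
    """Size of the largest set of domain points shattered by the class."""
    n = len(next(iter(hypotheses)))
    best = 0
    for d in range(1, n + 1):
        shattered = any(
            len({tuple(h[x] for x in points) for h in hypotheses}) == 2 ** d
            for points in combinations(range(n), d)
        )
        if not shattered:
            break  # if no set of size d is shattered, no larger set is either
        best = d
    return best


@lru_cache(maxsize=None)
def littlestone_dimension(hypotheses):
    """Depth of the deepest complete mistake tree shattered by the class."""
    if len(hypotheses) <= 1:
        return 0
    n = len(next(iter(hypotheses)))
    best = 0
    for x in range(n):
        zeros = frozenset(h for h in hypotheses if h[x] == 0)
        ones = hypotheses - zeros
        if zeros and ones:
            # The adversary queries x and keeps whichever side the learner got wrong.
            depth = 1 + min(littlestone_dimension(zeros), littlestone_dimension(ones))
            best = max(best, depth)
    return best


if __name__ == "__main__":
    for n in (1, 3, 7, 15):
        h = thresholds(n)
        print(n, vc_dimension(h), littlestone_dimension(h))
```

On these domains the VC dimension stays at 1 while the Littlestone dimension grows with the domain size, since an adversary can run a binary search over the threshold. On an infinite domain the Littlestone dimension of thresholds is unbounded, so a characterization by VC dimension makes such a class learnable in random order even though it is not learnable against a general adversary.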
