Fixed-Budget Change Point Identification in Piecewise Constant Bandits

📅 2025-01-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses the precise localization of a change point, i.e., an isolated discontinuity where the mean reward jumps, in piecewise-constant bandits with action space [0,1] and a fixed total sampling budget. We propose a regime-adaptive algorithm that combines confidence-interval-driven sequential sampling with a phased binary search framework. We give the first non-asymptotic analysis of this problem, establishing lower bounds on the error probability that reveal a separation in problem complexity between the small- and large-budget regimes. Theoretically, our algorithm is near-optimal in both regimes simultaneously. Empirically, experiments in simulated environments support these findings.

๐Ÿ“ Abstract
We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner's aim is to locate the change point. Under the assumption of a fixed exploration budget, we provide the first non-asymptotic analysis of policies designed to locate abrupt changes in the mean reward function under bandit feedback. We study the problem under a large and small budget regime, and for both settings establish lower bounds on the error probability and provide algorithms with near matching upper bounds. Interestingly, our results show a separation in the complexity of the two regimes. We then propose a regime adaptive algorithm which is near optimal for both small and large budgets simultaneously. We complement our theoretical analysis with experimental results in simulated environments to support our findings.
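
To make the problem setting concrete, here is a minimal Python sketch of a simulated environment with one change point on $[0,1]$, together with a naive fixed-budget bisection. This is not the authors' algorithm. The Gaussian noise model, the equal split of the budget across query points, the endpoint-based threshold, and the names `make_reward` and `locate_change_point` are all assumptions made for illustration.

```python
import random


def make_reward(change_point, left_mean, right_mean, noise_sd, rng):
    """Return a sampler giving a noisy reward for an action x in [0, 1].

    Hypothetical environment: the mean is left_mean for x < change_point
    and right_mean otherwise, with additive Gaussian noise (an assumption).
    """
    def pull(x):
        mean = left_mean if x < change_point else right_mean
        return mean + rng.gauss(0.0, noise_sd)
    return pull


def locate_change_point(pull, budget, rounds):
    """Naive fixed-budget bisection (illustrative only, not the paper's method).

    The budget is split evenly over the two endpoints and `rounds` midpoints.
    """
    per_point = budget // (rounds + 2)
    if per_point < 1:
        raise ValueError("budget too small for the requested number of rounds")

    def avg(x):
        # Empirical mean reward at action x from per_point samples.
        return sum(pull(x) for _ in range(per_point)) / per_point

    # Estimate the two plateau levels from the endpoints of the action space.
    left_est, right_est = avg(0.0), avg(1.0)
    threshold = (left_est + right_est) / 2
    left_is_low = left_est < right_est

    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        mid = (lo + hi) / 2
        # mid is judged to lie left of the change point if its empirical
        # mean falls on the same side of the threshold as the left plateau.
        on_left = (avg(mid) < threshold) == left_is_low
        if on_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


rng = random.Random(0)
pull = make_reward(change_point=0.37, left_mean=0.0, right_mean=1.0,
                   noise_sd=0.5, rng=rng)
estimate = locate_change_point(pull, budget=6000, rounds=10)
print(f"estimated change point: {estimate:.4f}")
```

In this sketch, more rounds give a finer final interval but leave fewer samples per query point, so each left/right decision becomes less reliable. That tension is only a loose illustration of why the size of the budget matters; the paper's own analysis and algorithms are not reproduced here.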
Problem

Research questions and friction points this paper is trying to address.

Change-point detection
Piecewise constant bandits
Reward mean shift
Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget-constrained Change-point Detection
Partially Informed Multi-armed Bandits
Optimal Algorithm Design
Joseph Lazzaro
Department of Mathematics, Imperial College London
Ciara Pike-Burke
Imperial College London