Fixed-Budget Change Point Identification in Piecewise Constant Bandits

📅 2025-01-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the localization of a single change point, i.e., a discontinuity where the mean reward jumps, in a piecewise constant bandit with action space [0,1] under a fixed total sampling budget. It gives the first non-asymptotic analysis of this problem under bandit feedback, with algorithms that combine confidence-interval-driven sequential sampling with a phased binary search. For both the small- and large-budget regimes, it establishes lower bounds on the error probability and near-matching upper bounds, revealing a separation in complexity between the two regimes. A regime-adaptive algorithm is near optimal for small and large budgets simultaneously, and experiments in simulated environments support the theoretical findings.

📝 Abstract

We study the piecewise constant bandit problem where the expected reward is a piecewise constant function with one change point (discontinuity) across the action space $[0,1]$ and the learner's aim is to locate the change point. Under the assumption of a fixed exploration budget, we provide the first non-asymptotic analysis of policies designed to locate abrupt changes in the mean reward function under bandit feedback. We study the problem under a large and small budget regime, and for both settings establish lower bounds on the error probability and provide algorithms with near matching upper bounds. Interestingly, our results show a separation in the complexity of the two regimes. We then propose a regime adaptive algorithm which is near optimal for both small and large budgets simultaneously. We complement our theoretical analysis with experimental results in simulated environments to support our findings.
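To make the setting concrete, the following is a minimal sketch of a simulated piecewise constant bandit on $[0,1]$ and a naive fixed-budget bisection that tries to locate its change point. It is not the paper's algorithm: the equal split of the budget across phases, the nearest-mean comparison rule, and all names (`make_env`, `locate_change_point`, `phases`, the Gaussian noise model) are assumptions made purely for illustration.

```python
import random


def make_env(change_point, mu_left, mu_right, noise_sd, rng):
    """Return a pull(x) function for a hypothetical piecewise constant bandit.

    The mean reward is mu_left for x < change_point and mu_right otherwise;
    Gaussian noise is an assumption of this sketch.
    """
    def pull(x):
        mean = mu_left if x < change_point else mu_right
        return mean + rng.gauss(0.0, noise_sd)
    return pull


def locate_change_point(pull, budget, phases):
    """Naive fixed-budget bisection (illustrative only).

    The budget is split evenly over the two endpoints and one midpoint
    per phase, so at most `budget` pulls are used in total.
    """
    per_point = budget // (phases + 2)
    if per_point < 1:
        raise ValueError("budget too small for the requested number of phases")

    def average(x):
        return sum(pull(x) for _ in range(per_point)) / per_point

    lo, hi = 0.0, 1.0
    # Estimate the two plateau levels once, at the ends of the action space.
    mean_lo, mean_hi = average(lo), average(hi)
    for _ in range(phases):
        mid = (lo + hi) / 2
        mean_mid = average(mid)
        # If the midpoint looks like the left plateau, the change point
        # must lie to its right; otherwise it lies to its left.
        if abs(mean_mid - mean_lo) <= abs(mean_mid - mean_hi):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    rng = random.Random(0)
    pull = make_env(change_point=0.37, mu_left=0.0, mu_right=1.0,
                    noise_sd=0.5, rng=rng)
    print(locate_change_point(pull, budget=6000, phases=10))
```

With a fixed budget, more phases narrow the final interval but leave fewer samples per comparison, so each bisection step is more likely to go the wrong way. This sketch only illustrates that trade-off; it makes no claim about how the paper resolves it.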

Problem

Research questions and friction points this paper is trying to address.

Change-point detection

Piecewise constant bandits

Reward mean shift

Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget-constrained Change-point Detection

Bandit (Partial) Feedback

Near-optimal Regime-adaptive Algorithm

🔎 Similar Papers

Enhancing Changepoint Detection: Penalty Learning through Deep Learning Techniques