Fixing the Loose Brake: Exponential-Tailed Stopping Time in Best Arm Identification

📅 2024-11-04
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Best Arm Identification (BAI) in the fixed-confidence setting aims to identify the optimal arm with minimal sample complexity. However, many existing algorithms only provide high-probability or in-expectation bounds on the stopping time. Such bounds permit heavy-tailed stopping-time distributions, and high-probability bounds even permit a non-zero probability of never terminating. This work proves that the never-stopping event can indeed occur for some popular algorithms. It then proposes two approaches that guarantee exponential-tailed stopping times: (i) a constructive algorithm that combines the fixed-budget algorithm Sequential Halving with a doubling trick; and (ii) a meta-algorithm that upgrades any fixed-confidence BAI algorithm with a high-probability stopping guarantee into one with an exponential-tail guarantee. Theoretically, the algorithms satisfy \(\Pr(\tau > t) \leq C \exp(-ct)\) for constants \(C, c > 0\), where \(\tau\) is the stopping time, improving upon the polynomial tail bound established by Kalyanakrishnan et al. (2012). The result is a more predictable sample complexity for online decision-making.
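
A short derivation may help show why an exponential tail is the stronger guarantee. Taking the bound in the form quoted in the summary, \(\Pr(\tau > t) \leq C \exp(-ct)\), both a finite expected stopping time and almost-sure termination follow directly. The symbol \(T_\delta\) below is an assumed name for a generic high-probability bound and does not come from the paper.

```latex
% Exponential tail implies a finite expectation:
\mathbb{E}[\tau]
  = \int_0^\infty \Pr(\tau > t)\,dt
  \le \int_0^\infty C e^{-ct}\,dt
  = \frac{C}{c}.

% Exponential tail implies the algorithm stops almost surely:
\Pr(\tau = \infty)
  = \lim_{t \to \infty} \Pr(\tau > t)
  \le \lim_{t \to \infty} C e^{-ct}
  = 0.

% A high-probability bound alone (T_\delta is an assumed symbol) only gives
\Pr(\tau \le T_\delta) \ge 1 - \delta,
% which leaves up to \delta of probability mass unconstrained,
% possibly on the event \{\tau = \infty\}.
```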

📝 Abstract
The best arm identification problem requires identifying the best alternative (i.e., arm) in active experimentation using the smallest number of experiments (i.e., arm pulls), which is crucial for cost-efficient and timely decision-making processes. In the fixed confidence setting, an algorithm must stop data-dependently and return the estimated best arm with a correctness guarantee. Since this stopping time is random, we desire its distribution to have light tails. Unfortunately, many existing studies focus on high probability or in expectation bounds on the stopping time, which allow heavy tails and, for high probability bounds, even not stopping at all. We first prove that this never-stopping event can indeed happen for some popular algorithms. Motivated by this, we propose algorithms that provably enjoy an exponential-tailed stopping time, which improves upon the polynomial tail bound reported by Kalyanakrishnan et al. (2012). The first algorithm is based on a fixed budget algorithm called Sequential Halving along with a doubling trick. The second algorithm is a meta algorithm that takes in any fixed confidence algorithm with a high probability stopping guarantee and turns it into one that enjoys an exponential-tailed stopping time. Our results imply that there is much more to be desired for contemporary fixed confidence algorithms.
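
The first construction mentioned above can be illustrated with a minimal Python sketch. `sequential_halving` follows the standard fixed-budget Sequential Halving scheme: split the budget evenly over roughly log2(K) rounds and discard the worse half of the arms each round. `doubling_wrapper` reruns it with a doubled budget each phase. The stopping rule used by the paper is not given in this summary, so `should_stop` is a hypothetical caller-supplied placeholder, and all function and parameter names here are assumptions made for illustration.

```python
import math
import random


def sequential_halving(pull, num_arms, budget):
    """Fixed-budget Sequential Halving: return the index of the surviving arm.

    `pull(arm)` returns one reward sample for `arm`.
    """
    survivors = list(range(num_arms))
    num_rounds = max(1, math.ceil(math.log2(num_arms)))
    for _ in range(num_rounds):
        if len(survivors) == 1:
            break
        # Split the budget evenly across rounds and surviving arms.
        # At least one pull per arm, so a tiny budget can be exceeded.
        pulls_per_arm = max(1, budget // (len(survivors) * num_rounds))
        means = {
            arm: sum(pull(arm) for _ in range(pulls_per_arm)) / pulls_per_arm
            for arm in survivors
        }
        # Keep the better half (rounded up) by empirical mean.
        survivors.sort(key=lambda arm: means[arm], reverse=True)
        survivors = survivors[: math.ceil(len(survivors) / 2)]
    return survivors[0]


def doubling_wrapper(pull, num_arms, initial_budget, should_stop, max_phases=20):
    """Rerun Sequential Halving with budgets T, 2T, 4T, ... until told to stop.

    `should_stop(candidate, budget)` is a HYPOTHETICAL placeholder for a
    data-dependent stopping rule; it is not the rule from the paper.
    Returns the last candidate arm and the budget of the last phase run.
    """
    budget = initial_budget
    for phase in range(max_phases):
        candidate = sequential_halving(pull, num_arms, budget)
        if should_stop(candidate, budget) or phase == max_phases - 1:
            return candidate, budget
        budget *= 2


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.2, 0.4, 0.6, 0.8]  # illustrative Bernoulli arms

    def bernoulli_pull(arm):
        return 1.0 if rng.random() < true_means[arm] else 0.0

    arm, used = doubling_wrapper(
        bernoulli_pull,
        num_arms=len(true_means),
        initial_budget=64,
        should_stop=lambda candidate, budget: budget >= 1024,  # toy rule
    )
    print("candidate arm:", arm, "final phase budget:", used)
```

The doubling schedule means the total number of pulls is at most about twice the budget of the final phase, which is why a doubling trick is a common way to turn a fixed-budget routine into one that adapts its own horizon.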
Problem

Research questions and friction points this paper is trying to address.

Identify the best arm with as few arm pulls as possible
Ensure the stopping time has a light, exponential tail
Improve on existing algorithms that offer only polynomial tail bounds
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exponential-tailed stopping time algorithm
Sequential Halving with doubling trick
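Meta algorithm that gives any fixed-confidence algorithm with a high-probability stopping guarantee an exponential-tailed stopping time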