Exploration in the Limit

📅 2025-12-31
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a central challenge in fixed-confidence best-arm identification: existing methods often sacrifice either rigorous error control or sample efficiency because they rely on loose tail bounds or strong parametric assumptions. The authors propose an asymptotic error-control framework that constructs anytime-valid confidence sequences tailored to long-horizon experiments, enabling a nonparametric best-arm identification algorithm that leverages individual contextual information. By integrating nonparametric statistical inference with covariate-assisted variance reduction, the method achieves worst-case sample complexity comparable to that of optimal algorithms under Gaussian assumptions with known variance, all under mild regularity conditions. Empirical evaluations demonstrate a substantial reduction in average sample consumption while maintaining strict control of the error rate.
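The "covariate-assisted variance reduction" mentioned above is, in its simplest form, regression adjustment: subtracting the part of each outcome explained by an observed covariate leaves an estimator with the same target mean but lower variance. A minimal sketch of that idea on synthetic data (the variable names and the linear model are illustrative assumptions, not the paper's construction):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)                   # observed per-individual covariate
y = 2.0 + 1.5 * x + rng.normal(size=n)   # outcome correlated with the covariate

raw = y                                  # unadjusted outcomes
beta = np.cov(x, y)[0, 1] / np.var(x)    # OLS slope of y on x
adj = y - beta * (x - x.mean())          # regression-adjusted pseudo-outcomes

# Adjustment preserves the mean but strips out covariate-explained variance.
print(raw.mean(), adj.mean())            # essentially identical
print(raw.var(), adj.var())              # adjusted variance is much smaller
```

Here the adjusted variance is close to the noise variance alone, so any mean estimate (or confidence sequence width) built on the adjusted outcomes shrinks accordingly.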

📝 Abstract
In fixed-confidence best arm identification (BAI), the objective is to quickly identify the optimal option while controlling the probability of error below a desired threshold. Despite the plethora of BAI algorithms, existing methods typically fall short in practical settings, as stringent exact error control requires using loose tail inequalities and/or parametric restrictions. To overcome these limitations, we introduce a relaxed formulation that requires valid error control asymptotically with respect to a minimum sample size. This aligns with many real-world settings that often involve weak signals, high desired significance, and post-experiment inference requirements, all of which necessitate long horizons. This allows us to achieve tighter optimality, while better handling flexible nonparametric outcome distributions and fully leveraging individual-level contexts. We develop a novel asymptotic anytime-valid confidence sequence over arm indices, and we use it to design a new BAI algorithm for our asymptotic framework. Our method flexibly incorporates covariates for variance reduction and ensures approximate error control in fully nonparametric settings. Under mild convergence assumptions, we provide asymptotic bounds on the sample complexity and show that the worst-case sample complexity of our approach matches the best-case sample complexity of Gaussian BAI under exact error guarantees and known variances. Experiments suggest our approach reduces average sample complexity while maintaining error control.
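To make the fixed-confidence BAI setup concrete, a generic algorithm of this family maintains an anytime-valid confidence interval per arm and stops once the empirical best arm's lower bound clears every other arm's upper bound. The sketch below uses round-robin sampling and an illustrative LIL-style boundary; the paper's actual confidence-sequence construction, sampling rule, and covariate adjustment are more refined, so treat every formula and name here as an assumption for illustration only:

```python
import numpy as np

def halfwidth(var, t, alpha):
    # Illustrative iterated-logarithm-style anytime boundary, NOT the
    # paper's construction: width shrinks roughly like sqrt(log log t / t).
    t = max(t, 3)
    return np.sqrt(2.0 * var * (np.log(np.log(t)) + np.log(2.0 / alpha)) / t)

def identify_best_arm(pull, n_arms, alpha=0.05, burn_in=30, budget=200_000):
    """Return (index of identified best arm, total samples used)."""
    counts = np.zeros(n_arms, dtype=int)
    means = np.zeros(n_arms)
    m2 = np.zeros(n_arms)              # Welford running sum of squared deviations
    for step in range(budget):
        a = step % n_arms              # round-robin sampling for simplicity
        x = pull(a)
        counts[a] += 1
        d = x - means[a]
        means[a] += d / counts[a]
        m2[a] += d * (x - means[a])
        if counts.min() >= burn_in:
            var = m2 / np.maximum(counts - 1, 1)
            hw = np.array([halfwidth(var[i], counts[i], alpha / n_arms)
                           for i in range(n_arms)])   # union bound over arms
            best = int(np.argmax(means))
            lcb_best = means[best] - hw[best]
            ucb_rest = np.max(np.delete(means + hw, best))
            if lcb_best > ucb_rest:    # best arm separated: stop and declare
                return best, int(counts.sum())
    return int(np.argmax(means)), int(counts.sum())
```

For example, with two Gaussian arms of means 0.0 and 1.0 (unit variance), `identify_best_arm(lambda a: rng.normal([0.0, 1.0][a], 1.0), 2)` stops well before the budget and returns arm 1. The per-arm `alpha / n_arms` split is a simple union bound; sharper allocations are one of the things the paper improves on.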
Problem

Research questions and friction points this paper is trying to address.

best arm identification
fixed-confidence
error control
nonparametric
sample complexity
Innovation

Methods, ideas, or system contributions that make the work stand out.

asymptotic confidence sequences
best arm identification
nonparametric BAI
covariate adjustment
sample complexity