On the problem of Best Arm Retention

📅 2025-04-16

🏛️ Theoretical Computer Science

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper introduces “Optimal Arm Retention” (OAR), a pure-exploration problem in stochastic multi-armed bandits: retain a subset of $m$ arms out of $n$ that provably contains the globally optimal arm, under budget or dynamic constraints. The paper formalizes OAR as an independent learning objective and proposes a unified framework that jointly optimizes retention confidence and decision-switching cost. The method combines UCB-style upper confidence bounds, Bayesian posterior updates, and sequential significance testing, with theoretical guarantees on regret. In empirical evaluation on synthetic benchmarks and real-world delayed-feedback settings, the approach improves retention accuracy by 37% and reduces decision switches by 52% compared to classical Best Arm Identification algorithms, improving online adaptability and deployment efficiency.
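To make the retention objective concrete, the following is a minimal sketch of a naive baseline, not the paper's method: pull every arm the same number of times and keep the $m$ arms with the highest empirical means. It assumes Bernoulli rewards, and the names `retain_top_m` and `retention_accuracy` are hypothetical, introduced here only for illustration.

```python
import random


def retain_top_m(means, m, pulls_per_arm, seed=0):
    """Uniform-sampling baseline (an assumption, not the paper's algorithm).

    Pulls each Bernoulli arm `pulls_per_arm` times and returns the indices
    of the m arms with the highest empirical mean reward.
    """
    rng = random.Random(seed)
    empirical = []
    for mu in means:
        # Simulate Bernoulli(mu) rewards and record the empirical mean.
        wins = sum(1 for _ in range(pulls_per_arm) if rng.random() < mu)
        empirical.append(wins / pulls_per_arm)
    # Rank arms by empirical mean, best first; ties go to the lower index.
    ranked = sorted(range(len(means)), key=lambda i: empirical[i], reverse=True)
    return set(ranked[:m])


def retention_accuracy(means, m, pulls_per_arm, trials=200):
    """Fraction of independent runs in which the true best arm is retained."""
    best = max(range(len(means)), key=lambda i: means[i])
    hits = sum(
        best in retain_top_m(means, m, pulls_per_arm, seed=t)
        for t in range(trials)
    )
    return hits / trials
```

Retaining $m > 1$ arms is a weaker requirement than identifying the single best arm, so a call such as `retention_accuracy(means, m, pulls_per_arm)` can be used to compare how the success rate changes as $m$ grows for a fixed sampling budget.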