Strategic A/B testing via Maximum Probability-driven Two-armed Bandit

📅 2025-06-27
🤖 AI Summary
In large-scale A/B testing, low statistical power makes it hard to detect small yet economically meaningful average treatment effects (ATEs). To address this, we propose a maximum-probability-driven two-armed bandit testing framework that combines counterfactual modeling, a weighted mean volatility statistic, and permutation testing. The central theoretical result is the "strategic central limit theorem" (SCLT): the test statistic's distribution is more concentrated under the null hypothesis and less concentrated under the alternative, which raises detection sensitivity while Type I error stays controlled. Empirical evaluations indicate that the method achieves high statistical power and has the potential to reduce the sample size and duration an experiment needs, and with them its operational cost. The framework is grounded in asymptotic theory and causal inference principles and is aimed at deployment in industrial settings.

📝 Abstract
Detecting a minor average treatment effect is a major challenge in large-scale applications, where even minimal improvements can have a significant economic impact. Traditional methods, which rely on normal distribution-based or expanded statistics, often fail to identify such minor effects because they are not sensitive enough to small discrepancies. This work leverages a counterfactual outcome framework and proposes a maximum probability-driven two-armed bandit (TAB) process that weights the mean volatility statistic and controls Type I error. Permutation methods further enhance robustness and efficacy. The established strategic central limit theorem (SCLT) demonstrates that our approach yields a more concentrated distribution under the null hypothesis and a less concentrated one under the alternative hypothesis, greatly improving statistical power. The experimental results indicate a significant improvement in A/B testing performance, highlighting the potential to reduce experimental costs while maintaining high statistical power.
Problem

Research questions and friction points this paper is trying to address.

Detecting minor average treatment effects in large-scale applications
Improving sensitivity to small discrepancies in A/B testing
Reducing experimental costs while maintaining high statistical power
Innovation

Methods, ideas, or system contributions that make the work stand out.

Maximum probability-driven two-armed bandit process
Weighted mean volatility statistic controls Type I error
Permutation methods enhance robustness and efficacy
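
For readers unfamiliar with the permutation component, the sketch below shows a standard two-sample permutation test on a difference in means. It is not the paper's method: it uses a plain difference-in-means statistic rather than the weighted mean volatility statistic or the TAB process, and the function name `permutation_test_ate`, its parameters, and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np


def permutation_test_ate(control, treatment, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Illustrative only: a generic test, not the statistic proposed in the paper.
    Returns the observed difference (treatment mean minus control mean)
    and a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    treatment = np.asarray(treatment, dtype=float)

    observed = treatment.mean() - control.mean()
    pooled = np.concatenate([control, treatment])
    n_control = control.size

    exceed = 0
    for _ in range(n_permutations):
        # Under the null, group labels are exchangeable, so reshuffle them.
        shuffled = rng.permutation(pooled)
        diff = shuffled[n_control:].mean() - shuffled[:n_control].mean()
        if abs(diff) >= abs(observed):
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Hypothetical simulated experiment with a small true effect of 0.05.
    rng = np.random.default_rng(42)
    control = rng.normal(loc=0.0, scale=1.0, size=5000)
    treatment = rng.normal(loc=0.05, scale=1.0, size=5000)
    effect, p = permutation_test_ate(control, treatment)
    print(f"estimated ATE = {effect:.4f}, permutation p-value = {p:.4f}")
```

Because the reference distribution is built from the data by relabeling, the test does not depend on a normal approximation, which is the general property that makes permutation methods attractive when effects are small.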
Yu Zhang
Zhongtai Securities Institute for Financial Studies, Shandong University, Jinan, China
Shanshan Zhao
School of Mathematics, Shandong University, Jinan, China
Bokui Wan
Didi Chuxing, Beijing, China
Jinjuan Wang
School of Mathematics and Statistics, Beijing Institute of Technology, Beijing, China
Xiaodong Yan
Unknown affiliation
Statistics, Machine Learning