🤖 AI Summary
This study addresses the coupled impact of carryover effects and reward autocorrelation in switchback experiments for A/B testing—a interaction overlooked in existing designs. We propose the first unified Markovian framework for joint modeling and analysis, revealing a phase-transition–like degradation in estimator accuracy induced by their coupling. Our method is estimator-agnostic and yields practical design principles: optimal switching-period determination, minimum-variance design selection, and an implementable operational workflow. Extensive simulations and deployment on industrial platforms demonstrate that our approach improves statistical power by 20–40% over conventional fixed-period or randomized switchback designs. The core contribution lies in establishing a novel analytical paradigm for carryover–autocorrelation coupling, and deriving actionable phase-transition boundaries and design guidelines deployable in real-world experimentation systems.
📝 Abstract
A/B testing has become the gold standard for policy evaluation in modern technological industries. Motivated by the widespread use of switchback experiments in A/B testing, this paper conducts a comprehensive comparative analysis of various switchback designs in Markovian environments. Unlike many existing works which derive the optimal design based on specific and relatively simple estimators, our analysis covers a range of state-of-the-art estimators developed in the reinforcement learning (RL) literature. It reveals that the effectiveness of different switchback designs depends crucially on (i) the size of the carryover effect and (ii) the auto-correlations among reward errors over time. Meanwhile, these findings are estimator-agnostic, i.e., they apply to most RL estimators. Based on these insights, we provide a workflow to offer guidelines for practitioners on designing switchback experiments in A/B testing.