🤖 AI Summary
In A/B tests where the treatment and control algorithms train on a shared pool of data, symbiosis bias distorts estimates of the global treatment effect (GTE) and can undermine the reliability of algorithm selection. For choosing the better algorithm, correctly identifying the sign of the GTE often matters more than estimating its magnitude precisely. Method: Working within a multi-armed bandit framework, we theoretically characterize when the sign of the expected GTE estimate under data sharing agrees with or contradicts the sign of the true GTE. Results: We identify the level of exploration versus exploitation as a key determinant of how symbiosis bias affects algorithm selection. This analysis gives a theoretical basis for judging when A/B experiments in data-sharing settings can be relied on to identify the better algorithm.
📝 Abstract
We study A/B experiments that are designed to compare the performance of two recommendation algorithms. Prior work has shown that the standard difference-in-means estimator is biased in estimating the global treatment effect (GTE) due to a particular form of interference between experimental units. Specifically, units under the treatment and control algorithms contribute to a shared pool of data that subsequently train both algorithms, resulting in interference between the two groups. The bias arising from this type of data sharing is known as "symbiosis bias". In this paper, we highlight that, for decision-making purposes, the sign of the GTE often matters more than its precise magnitude when selecting the better algorithm. We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the expected GTE estimate under data sharing aligns with or contradicts the sign of the true GTE. Our analysis identifies the level of exploration versus exploitation as a key determinant of how symbiosis bias impacts algorithm selection.
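To make the setting concrete, the following is a minimal simulation sketch, not the paper's model or analysis. It assumes a Bernoulli multi-armed bandit and two epsilon-greedy algorithms that differ only in their exploration rate; the function names (`simulate`, `true_gte`, `ab_estimate`) and all parameter values are hypothetical choices made for illustration. It contrasts the GTE (every unit served by one algorithm, which learns only from its own data) with the difference-in-means estimate from an A/B test in which both algorithms learn from one shared pool.

```python
import random


def simulate(eps, treat_prob, arm_means, horizon, rng):
    """Run one experiment in which both groups learn from a shared data pool.

    eps        -- (control, treatment) exploration rates (assumed epsilon-greedy)
    treat_prob -- probability that a unit is assigned to treatment
    Returns [control_mean_reward, treatment_mean_reward]; nan for an empty group.
    """
    k = len(arm_means)
    pulls = [0] * k            # shared pool: number of pulls per arm
    reward_sum = [0.0] * k     # shared pool: total reward per arm
    group_reward = [0.0, 0.0]
    group_count = [0, 0]
    for _ in range(horizon):
        g = 1 if rng.random() < treat_prob else 0
        if rng.random() < eps[g]:
            arm = rng.randrange(k)  # explore uniformly at random
        else:
            # Exploit the shared empirical means; untried arms are treated
            # optimistically so that each arm is pulled at least once.
            est = [reward_sum[a] / pulls[a] if pulls[a] else float("inf")
                   for a in range(k)]
            arm = max(range(k), key=est.__getitem__)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += reward
        group_reward[g] += reward
        group_count[g] += 1
    return [group_reward[i] / group_count[i] if group_count[i] else float("nan")
            for i in range(2)]


def true_gte(eps_control, eps_treatment, arm_means, horizon, reps, seed=0):
    """Monte Carlo GTE: all units treated minus all units in control."""
    rng = random.Random(seed)
    eps = (eps_control, eps_treatment)
    total = 0.0
    for _ in range(reps):
        all_treated = simulate(eps, 1.0, arm_means, horizon, rng)[1]
        all_control = simulate(eps, 0.0, arm_means, horizon, rng)[0]
        total += all_treated - all_control
    return total / reps


def ab_estimate(eps_control, eps_treatment, arm_means, horizon, reps, seed=0):
    """Monte Carlo mean of the difference-in-means estimate under data sharing."""
    rng = random.Random(seed)
    eps = (eps_control, eps_treatment)
    total = 0.0
    for _ in range(reps):
        control_mean, treated_mean = simulate(eps, 0.5, arm_means, horizon, rng)
        total += treated_mean - control_mean
    return total / reps
```

Calling `true_gte(0.3, 0.05, [0.4, 0.6], horizon=200, reps=500)` and `ab_estimate` with the same arguments, then comparing the signs of the two results, gives a rough empirical check of whether the shared-data experiment would select the same algorithm as the GTE. The outcome depends on the chosen exploration rates, arm means, and horizon, so this sketch should not be read as reproducing the paper's theoretical conditions.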