🤖 AI Summary
This study addresses treatment effect spillovers in A/B tests of social products, where user interactions carry treatment effects into the control group and bias causal estimates. The authors propose a two-stage framework. In the pre-experiment phase, a Balanced Louvain algorithm partitions the social graph into size-balanced clusters while minimizing cross-cluster edges, so that randomization can be performed at the cluster level to contain spillovers. In the post-experiment phase, a covariate-adjusted CUPAC estimator reduces the variance induced by cluster-level assignment and improves statistical power. This work is the first to integrate spillover-aware cluster design with covariate-adjusted analysis, achieving unbiased estimation while substantially improving efficiency. Empirical validation on large-scale social sharing experiments at Kuaishou shows that the framework substantially reduces spillover and yields more accurate policy evaluations than conventional user-level randomization.
📝 Abstract
A/B testing is the foundation of decision-making in online platforms, yet social products often suffer from network interference: user interactions cause treatment effects to spill over into the control group. Such spillovers bias causal estimates and undermine experimental conclusions. Existing approaches face key limitations: user-level randomization ignores network structure, while cluster-based methods often rely on general-purpose clustering that is not tailored for spillover containment and has difficulty balancing unbiasedness and statistical power at scale. We propose a spillover-contained experimentation framework with two stages. In the pre-experiment stage, we build social interaction graphs and introduce a Balanced Louvain algorithm that produces stable, size-balanced clusters while minimizing cross-cluster edges, enabling reliable cluster-based randomization. In the post-experiment stage, we develop a tailored CUPAC estimator that leverages pre-experiment behavioral covariates to reduce the variance induced by cluster-level assignment, thereby improving statistical power. Together, these components provide both structural spillover containment and robust statistical inference. We validate our approach through large-scale social sharing experiments on Kuaishou, a platform serving hundreds of millions of users. Results show that our method substantially reduces spillover and yields more accurate assessments of social strategies than traditional user-level designs, establishing a reliable and scalable framework for networked A/B testing.
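As a rough illustration of the two stages, the sketch below assumes a clustering is already available. It measures the share of cross-cluster edges, randomizes whole clusters, and applies a generic control-variate adjustment with a pre-experiment covariate. It is not the authors' Balanced Louvain algorithm or their tailored CUPAC estimator; all function names and the toy data are hypothetical, and a CUPAC-style analysis would typically pass a model prediction of the outcome as the covariate `x`.

```python
import random
from statistics import mean


def cross_cluster_edge_fraction(edges, cluster_of):
    """Share of edges whose endpoints fall in different clusters (lower is better)."""
    if not edges:
        return 0.0
    crossing = sum(1 for u, v in edges if cluster_of[u] != cluster_of[v])
    return crossing / len(edges)


def assign_clusters(cluster_ids, seed=0):
    """Randomize whole clusters: half go to treatment (1), the rest to control (0)."""
    ids = sorted(set(cluster_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {c: (1 if i < half else 0) for i, c in enumerate(ids)}


def covariate_adjusted_effect(y, x, t):
    """Difference in means of y - theta * (x - mean(x)), theta = cov(x, y) / var(x).

    y, x, t are per-cluster lists: outcome, pre-experiment covariate
    (or a prediction of the outcome), and treatment arm (1 or 0).
    """
    x_bar, y_bar = mean(x), mean(y)
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    theta = cov_xy / var_x if var_x > 0 else 0.0
    adjusted = [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]
    treated = [a for a, ti in zip(adjusted, t) if ti == 1]
    control = [a for a, ti in zip(adjusted, t) if ti == 0]
    return mean(treated) - mean(control)


# Toy graph: two triangles joined by a single edge (hypothetical data).
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
cluster_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
leak = cross_cluster_edge_fraction(edges, cluster_of)

# Every user inherits the arm of their cluster.
arms = assign_clusters(cluster_of.values(), seed=42)
user_arm = {u: arms[c] for u, c in cluster_of.items()}

# Toy cluster-level outcomes built as y = x + 2 * t.
x = [1.0, 1.0, 3.0, 3.0]
t = [1, 0, 1, 0]
y = [3.0, 1.0, 5.0, 3.0]
effect = covariate_adjusted_effect(y, x, t)
```

Because assignment happens per cluster, connected users mostly share an arm, and the adjustment removes the part of the outcome variation that the pre-experiment covariate already explains.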