🤖 AI Summary
Online platform experiments frequently violate the Stable Unit Treatment Value Assumption (SUTVA) due to network interference, and conventional estimators offer only a limited menu of bias-variance tradeoffs: the simple difference-in-means (DM) estimator ignores remaining interference and incurs bias, while the unbiased Horvitz-Thompson (HT) estimator removes that bias at a steep cost in variance. To address this, we propose the Differences-in-Neighbors (DN) estimator, designed explicitly to mitigate network interference. DN achieves bias that is second order in the magnitude of the interference effect, with variance exponentially smaller than HT's, substantially improving the bias-variance tradeoff; combined with clustered randomized designs, it attains tradeoffs not achievable by existing approaches. Empirical evaluations on a large-scale social network and a city-level ride-sharing simulator demonstrate that DN consistently outperforms both DM and HT in estimation accuracy and stability at practical scale.
📝 Abstract
Experiments in online platforms frequently suffer from network interference, in which a treatment applied to a given unit affects outcomes for other units connected via the platform. This SUTVA violation biases naive approaches to experiment design and estimation. A common solution is to reduce interference by clustering connected units and randomizing treatments at the cluster level, typically followed by estimation using one of two extremes: either a simple difference-in-means (DM) estimator, which ignores remaining interference; or an unbiased Horvitz-Thompson (HT) estimator, which eliminates interference bias at great cost in variance. Even combined with clustered designs, this presents a limited set of achievable bias-variance tradeoffs. We propose a new estimator, dubbed Differences-in-Neighbors (DN), designed explicitly to mitigate network interference. Compared to DM estimators, DN achieves bias second order in the magnitude of the interference effect, while its variance is exponentially smaller than that of HT estimators. When combined with clustered designs, DN offers improved bias-variance tradeoffs not achievable by existing approaches. Empirical evaluations on a large-scale social network and a city-level ride-sharing simulator demonstrate the superior performance of DN in experiments at practical scale.
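To make the two baseline extremes concrete, here is a minimal, hedged sketch (not the paper's code; the toy graph, outcome model, and cluster layout are invented for illustration) contrasting the DM estimator with a standard HT estimator under cluster-level randomization. The HT estimator shown uses full-neighborhood exposure indicators with inverse-probability weights, which is the usual construction under which HT is unbiased but high variance; the paper's DN estimator is not reproduced here, since its exact form is not given in the abstract.

```python
import random

random.seed(0)

# Toy graph: 6 units in 3 clusters of 2, with some cross-cluster edges.
clusters = {0: [0, 1], 1: [2, 3], 2: [4, 5]}
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
p = 0.5  # each cluster is treated independently with probability p

def assign():
    """Randomize treatment at the cluster level (clustered design)."""
    z = {}
    for members in clusters.values():
        t = 1 if random.random() < p else 0
        for i in members:
            z[i] = t
    return z

def outcome(i, z):
    """Illustrative outcome: baseline + direct effect + neighbor spillover."""
    direct = 2.0 * z[i]
    spill = 0.5 * sum(z[j] for j in neighbors[i]) / len(neighbors[i])
    return 1.0 + direct + spill

def dm_estimate(z, y):
    """DM ignores interference: mean(treated) - mean(control)."""
    t = [y[i] for i in y if z[i] == 1]
    c = [y[i] for i in y if z[i] == 0]
    return sum(t) / len(t) - sum(c) / len(c)

def ht_estimate(z, y):
    """HT weights units by full-neighborhood exposure: a unit contributes
    only when its entire closed neighborhood is treated (or control),
    reweighted by the inverse probability of that exposure. Unbiased,
    but the indicators are rarely 1, so the variance is large."""
    total = 0.0
    for i in y:
        nbhd = [i] + neighbors[i]
        all_t = all(z[j] == 1 for j in nbhd)
        all_c = all(z[j] == 0 for j in nbhd)
        # Exposure probability depends on how many distinct clusters
        # the closed neighborhood touches (assignments are independent).
        cs = {c for c, m in clusters.items() if set(m) & set(nbhd)}
        total += y[i] * (all_t / p ** len(cs) - all_c / (1 - p) ** len(cs))
    return total / len(y)

z = assign()
y = {i: outcome(i, z) for i in neighbors}
dm = dm_estimate(z, y)
ht = ht_estimate(z, y)
print("DM:", dm, "HT:", ht)
```

Under this outcome model the global treatment effect is 2.5 (every unit moves from 1.0 to 3.5); a single draw already shows the pattern the abstract describes, with DM close but biased by spillovers and HT swinging far from the truth on individual realizations.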