AI Summary
This study addresses the problem that conventional cluster-randomized designs can substantially inflate the mean squared error (MSE) of global average treatment effect (GATE) estimation in social networks that exhibit both interference and homophily. The authors propose a unified framework that jointly models interference, homophily, and treatment effect heterogeneity within the potential outcomes paradigm. They obtain a robust experimental design by optimizing the covariance structure of the treatment assignments to minimize the worst-case MSE. Two complementary algorithms are introduced: one based on semidefinite programming (SDP) with Gaussian rounding, and the other an adaptation of the Gram-Schmidt Walk vector-balancing procedure. Empirical evaluations on both synthetic and real-world village network data demonstrate that the proposed approach significantly reduces GATE estimation error compared to existing methods.
Abstract
To minimize the mean squared error (MSE) in global average treatment effect (GATE) estimation under network interference, a popular approach is to use a cluster-randomized design. However, in the presence of homophily, which is common in social networks, cluster randomization can instead increase the MSE. We develop a novel potential outcomes model that accounts for interference, homophily, and heterogeneous variation. In this setting, we establish a framework for optimizing designs for worst-case MSE under the Horvitz-Thompson estimator. This leads to an optimization problem over the covariance matrices of the treatment assignment, trading off interference, homophily, and robustness. We frame and solve this problem using two complementary approaches. The first involves formulating a semidefinite program (SDP) and employing Gaussian rounding, in the spirit of the Goemans-Williamson approximation algorithm for MAXCUT. The second is an adaptation of the Gram-Schmidt Walk, a vector-balancing algorithm which has recently received much attention. Finally, we evaluate the performance of our designs through various experiments on simulated network data and a real village network dataset.
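The two building blocks named in the abstract, Gaussian rounding of a covariance matrix and the Horvitz-Thompson estimator, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a symmetric positive semidefinite matrix `X` with unit diagonal is already available (here hand-picked, not the solution of the paper's SDP), that each unit is treated with marginal probability 1/2, and that the estimator is the simple difference of inverse-probability-weighted sums. The function names `cholesky`, `gaussian_rounding`, and `horvitz_thompson_gate` are hypothetical.

```python
import math
import random


def cholesky(X, jitter=1e-12):
    """Lower-triangular L with L L^T approximately equal to X (X symmetric PSD)."""
    n = len(X)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = X[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                # Small jitter guards against a zero pivot for near-singular X.
                L[i][j] = math.sqrt(max(s, 0.0) + jitter)
            else:
                L[i][j] = s / L[j][j]
    return L


def gaussian_rounding(X, rng):
    """Draw g ~ N(0, X) and return the sign vector z in {-1, +1}^n.

    Each z_i is +1 with probability 1/2, and for unit-diagonal X
    E[z_i z_j] = (2 / pi) * arcsin(X[i][j]).
    """
    L = cholesky(X)
    n = len(X)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    g = [sum(L[i][k] * xi[k] for k in range(i + 1)) for i in range(n)]
    return [1 if gi >= 0 else -1 for gi in g]


def horvitz_thompson_gate(z, y, p=0.5):
    """Horvitz-Thompson GATE estimate from assignments z and observed outcomes y.

    Assumption: every unit is treated (z_i = +1) with marginal probability p.
    """
    total = 0.0
    for zi, yi in zip(z, y):
        total += yi / p if zi == 1 else -yi / (1.0 - p)
    return total / len(z)


# Usage: a hand-picked 2x2 target covariance and toy outcomes (illustrative only).
rng = random.Random(0)
X = [[1.0, 0.5], [0.5, 1.0]]
z = gaussian_rounding(X, rng)
estimate = horvitz_thompson_gate([1, -1, 1, -1], [3.0, 1.0, 5.0, 1.0])  # 3.0
```

Note that the rounded assignments do not reproduce `X` exactly: their covariance is the entrywise arcsine transform of `X`, which is the same effect that appears in the analysis of the Goemans-Williamson algorithm.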