🤖 AI Summary
Causal inference under network interference faces key challenges in few-shot settings, including unverifiable counterfactual estimation and difficulty modeling spillover effects. To address these, we propose the first causal inference framework supporting cross-validation. Our method introduces: (1) a distribution-preserving network bootstrap that maintains both network topology and interference distribution; (2) joint modeling of unit-level heterogeneity and local interaction variability to enhance counterfactual accuracy; and (3) the first open-source causal evaluation toolbox featuring ground-truth benchmarks derived from real-world data. Theoretically, we establish non-asymptotic guarantees and extend causal message passing to accommodate interference. Empirically, we evaluate across diverse simulated and real-world scenarios—including AI agent collaboration, community opinion diffusion, and ride-hailing dispatch—demonstrating significant improvements in counterfactual prediction accuracy and cross-scenario generalization reliability.
📝 Abstract
In experimental settings with network interference, a unit's treatment can influence outcomes of other units, challenging both causal effect estimation and its validation. Classic validation approaches fail as outcomes are only observable under one treatment scenario and exhibit complex correlation patterns due to interference. To address these challenges, we introduce a new framework enabling cross-validation for counterfactual estimation. At its core is our distribution-preserving network bootstrap method -- a theoretically-grounded approach inspired by approximate message passing. This method creates multiple subpopulations while preserving the underlying distribution of network effects. We extend recent causal message-passing developments by incorporating heterogeneous unit-level characteristics and varying local interactions, ensuring reliable finite-sample performance through non-asymptotic analysis. We also develop and publicly release a comprehensive benchmark toolbox with diverse experimental environments, from networks of interacting AI agents to opinion formation in real-world communities and ride-sharing applications. These environments provide known ground truth values while maintaining realistic complexities, enabling systematic examination of causal inference methods. Extensive evaluation across these environments demonstrates our method's robustness to diverse forms of network interference. Our work provides researchers with both a practical estimation framework and a standardized platform for testing future methodological developments.