🤖 AI Summary
In distributed healthcare settings, high heterogeneity across sites in covariate, treatment, and outcome distributions leads to substantial bias and low efficiency in federated causal inference.
Method: This paper presents a systematic taxonomy of federated causal inference methods, dividing them into weight-based strategies and optimization-based frameworks. It analyzes FedProx-style regularization against naive averaging and meta-analysis in terms of the bias-variance trade-off, and extends the analysis to federated survival models (Cox and Aalen-Johansen). Combining federated learning (FedProx, peer-to-peer communication, model decomposition), causal effect estimators (IPW and the doubly robust AIPW), and asymptotic statistical theory, it derives the asymptotic bias and variance of federated causal estimators under heterogeneity.
Results: The analysis indicates that FedProx-style regularization achieves a near-optimal bias-variance trade-off relative to naive averaging and meta-analysis. The paper also reviews related software tools and outlines a roadmap for fair, trustworthy, and scalable federated causal inference.
📝 Abstract
Federated causal inference enables multi-site treatment effect estimation without sharing individual-level data, offering a privacy-preserving solution for real-world evidence generation. However, data heterogeneity across sites, manifested as differences in covariate, treatment, and outcome distributions, poses significant challenges for unbiased and efficient estimation. In this paper, we present a comprehensive review and theoretical analysis of federated causal effect estimation for binary, continuous, and time-to-event outcomes. We classify existing methods into weight-based strategies and optimization-based frameworks, and further discuss extensions including personalized models, peer-to-peer communication, and model decomposition. For time-to-event outcomes, we examine federated Cox and Aalen-Johansen models, deriving their asymptotic bias and variance under heterogeneity. Our analysis reveals that FedProx-style regularization achieves a near-optimal bias-variance trade-off relative to naive averaging and meta-analysis. We review related software tools and conclude by outlining opportunities, challenges, and future directions for scalable, fair, and trustworthy federated causal inference in distributed healthcare systems.
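To make "FedProx-style regularization" concrete, the following is a minimal sketch under stated assumptions, not the estimator analyzed in the paper. It assumes each site fits a linear outcome model by least squares, that treatment is randomized within sites, and that each site solves its proximal subproblem exactly in closed form. The function names (`fedprox_local_step`, `fedprox`, `simulate_site`), the penalty `mu`, and the simulated data are all hypothetical and chosen only for illustration. Each site minimizes its local loss plus a penalty (mu/2)·||w − w_global||², and the server takes a sample-size-weighted average of the local solutions.

```python
import numpy as np

def fedprox_local_step(X, y, w_global, mu):
    """Exact minimizer of (1/2n)||Xw - y||^2 + (mu/2)||w - w_global||^2."""
    n, d = X.shape
    A = X.T @ X / n + mu * np.eye(d)
    b = X.T @ y / n + mu * w_global
    return np.linalg.solve(A, b)

def fedprox(sites, mu=0.1, rounds=50):
    """sites: list of (X, y) pairs, one per site; only coefficients are exchanged."""
    d = sites[0][0].shape[1]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    w = np.zeros(d)
    for _ in range(rounds):
        # Each site pulls its local solution toward the current global model.
        local = np.stack([fedprox_local_step(X, y, w, mu) for X, y in sites])
        # Server step: sample-size-weighted average of the local solutions.
        w = sizes @ local / sizes.sum()
    return w

def simulate_site(rng, n, shift, effect=2.0):
    """Toy site: design columns are intercept, treatment indicator, one covariate."""
    x = rng.normal(loc=shift, size=n)      # covariate distribution differs by site
    a = rng.binomial(1, 0.5, size=n)       # randomized treatment (assumption)
    y = 1.0 + effect * a + 0.5 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), a, x])
    return X, y

rng = np.random.default_rng(0)
sites = [simulate_site(rng, n, shift)
         for n, shift in [(500, -1.0), (800, 0.0), (300, 1.5)]]
w_fedprox = fedprox(sites, mu=0.1, rounds=50)
print("estimated treatment coefficient:", w_fedprox[1])
```

In this sketch, a small `mu` lets each site's update stay close to its own least-squares fit, while a large `mu` keeps every site close to the current global model; the bias-variance trade-off discussed in the abstract concerns how such a penalty behaves when sites are heterogeneous.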