🤖 AI Summary
Problem: Cluster-randomized trials (CRTs) often have data missing simultaneously in outcomes and baseline covariates, at the individual or cluster level, and in cluster population sizes, which complicates valid causal inference.
Method: We propose a doubly robust estimator of the average treatment effect in CRTs, the first to address all of these missingness types at once. It combines inverse probability weighting with regression adjustment and can leverage machine learning. Missing outcomes are handled under a missing-at-random assumption, missing covariates without constraining the missingness mechanism, and missing cluster population sizes via a uniform sampling mechanism.
Contribution/Results: We describe how to improve precision by specifying optimal weights, and we develop a new sensitivity analysis framework tailored to CRTs for evaluating the impact of violated missing-data assumptions. Simulation studies assess the performance of the proposed methods, and a real-data application illustrates their use.
📝 Abstract
In cluster-randomized trials (CRTs), missing data can occur in various ways, including missing values in outcomes and baseline covariates at the individual or cluster level, or completely missing information for non-participants. Among these types of missing data, missing outcomes have attracted the most attention. However, no existing method addresses all of them simultaneously. To fill this gap, we propose a doubly robust estimator for the average treatment effect on a variety of effect measure scales. The proposed estimator simultaneously handles missing outcomes under a missing-at-random assumption, missing covariates without constraining the missingness mechanism, and missing cluster-population sizes via a uniform sampling mechanism. Furthermore, we detail key considerations for improving precision: specifying the optimal weights, leveraging machine learning, and modeling the treatment assignment mechanism. Finally, to evaluate the impact of violating missing data assumptions, we contribute a new sensitivity analysis framework tailored to CRTs. We assess the performance of the proposed methods through simulations and illustrate their use in a real data application.
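To make the doubly robust idea concrete, here is a minimal sketch of a generic augmented inverse-probability-weighted (AIPW) arm mean with outcomes missing at random. It is not the paper's estimator: it does not cover missing covariates or missing cluster sizes, and the function names (`aipw_arm_mean`, `effect`), the equal weighting of clusters, and the assumption that the caller supplies the fitted predictions `p_obs` and `m_hat` are all illustrative choices.

```python
import numpy as np

def aipw_arm_mean(y, observed, in_arm, p_arm, p_obs, m_hat, cluster):
    """Generic AIPW estimate of the mean outcome under one arm (illustrative).

    y        : outcomes (may be NaN where missing)
    observed : True where the outcome was observed
    in_arm   : True where the individual's cluster was assigned to this arm
    p_arm    : known randomization probability of this arm
    p_obs    : predicted probability that the outcome is observed
    m_hat    : predicted outcome under this arm
    cluster  : cluster identifier per individual
    """
    observed = np.asarray(observed, dtype=bool)
    in_arm = np.asarray(in_arm, dtype=bool)
    p_obs = np.asarray(p_obs, dtype=float)
    m_hat = np.asarray(m_hat, dtype=float)
    # Missing outcomes get weight zero, so replace NaN to keep arithmetic finite.
    y = np.where(observed, np.asarray(y, dtype=float), 0.0)

    # Regression prediction plus an inverse-probability-weighted residual.
    weight = (in_arm & observed) / (p_arm * p_obs)
    pseudo = m_hat + weight * (y - m_hat)

    # Assumption: every cluster counts equally (cluster-average estimand).
    _, idx = np.unique(np.asarray(cluster), return_inverse=True)
    cluster_means = np.bincount(idx, weights=pseudo) / np.bincount(idx)
    return cluster_means.mean()

def effect(mu1, mu0, scale="difference"):
    """Contrast two arm means on a chosen effect measure scale."""
    if scale == "difference":
        return mu1 - mu0
    if scale == "ratio":
        return mu1 / mu0
    raise ValueError(f"unknown scale: {scale}")
```

The estimate stays consistent if either the outcome model (`m_hat`) or the missingness model (`p_obs`) is correct, which is the sense in which such estimators are called doubly robust. Other cluster weights, for example weighting by cluster size, would target a different estimand.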