Handling incomplete outcomes and covariates in cluster-randomized trials: doubly-robust estimation, efficiency considerations, and sensitivity analysis

📅 2024-01-20

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Cluster-randomized trials (CRTs) often have missing data in several places at once: individual- and cluster-level outcomes, baseline covariates, and cluster-population sizes. This complicates valid causal inference. Method: We propose a doubly-robust estimator of the average treatment effect, applicable on a variety of effect measure scales, which the abstract describes as the first to address all of these types of missing data simultaneously. It handles missing outcomes under a missing-at-random assumption, missing covariates without constraining the missingness mechanism, and missing cluster-population sizes via a uniform sampling mechanism. Precision can be improved by specifying optimal weights, leveraging machine learning, and modeling the treatment assignment mechanism. Contribution/Results: We also contribute a sensitivity analysis framework tailored to CRTs for evaluating the impact of violating the missing data assumptions. The methods are assessed through simulations and illustrated in a real data application.

📝 Abstract

In cluster-randomized trials (CRTs), missing data can occur in various ways, including missing values in outcomes and baseline covariates at the individual or cluster level, or completely missing information for non-participants. Among these types of missing data, missing outcomes have attracted the most attention. However, no existing method simultaneously addresses all of them. To fill this gap, we propose a doubly-robust estimator for the average treatment effect on a variety of effect measure scales. The proposed estimator simultaneously handles missing outcomes under a missing-at-random assumption, missing covariates without constraining the missingness mechanism, and missing cluster-population sizes via a uniform sampling mechanism. Furthermore, we detail key considerations to improve precision by specifying the optimal weights, leveraging machine learning, and modeling the treatment assignment mechanism. Finally, to evaluate the impact of violating missing data assumptions, we contribute a new sensitivity analysis framework tailored to CRTs. We assess the performance of the proposed methods through simulations and illustrate their use in a real data application.
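
To make the idea of double robustness for missing outcomes concrete, here is a minimal Python sketch of a generic augmented inverse probability weighting (AIPW) estimator in a cluster-randomized setting. It is not the paper's estimator: the function name `aipw_cluster_ate`, the single binary covariate, and the saturated cell-mean models (used here in place of machine learning) are all assumptions made for illustration. The sketch targets a cluster-average treatment effect on the difference scale, and it does not cover missing covariates, missing cluster-population sizes, optimal weights, or variance estimation.

```python
import numpy as np


def aipw_cluster_ate(cluster, A, X, Y, R):
    """Illustrative AIPW estimate of a cluster-average treatment effect.

    cluster: cluster id for each individual
    A: cluster-level arm (0/1), repeated for each individual
    X: binary baseline covariate (assumed fully observed here)
    Y: outcome; values where R == 0 are ignored and may be NaN
    R: 1 if the outcome is observed, 0 if it is missing
    """
    cluster, A, X, R = (np.asarray(v) for v in (cluster, A, X, R))
    # Replace unobserved outcomes with 0 so NaN never enters the arithmetic;
    # these entries are multiplied by R == 0 below.
    Y = np.where(R == 1, np.asarray(Y, dtype=float), 0.0)

    # Saturated working models: one cell per (arm, covariate) combination.
    cell = 2 * A + X
    pi = np.empty(len(Y))  # estimated P(R = 1 | A, X)
    m = np.empty(len(Y))   # estimated E[Y | A, X, R = 1]
    for c in np.unique(cell):
        in_cell = cell == c
        observed = in_cell & (R == 1)
        if not observed.any():
            raise ValueError("no observed outcomes in an (arm, covariate) cell")
        pi[in_cell] = R[in_cell].mean()
        m[in_cell] = Y[observed].mean()

    # AIPW pseudo-outcome: outcome-model prediction plus a weighted residual
    # that is nonzero only for individuals with an observed outcome.
    phi = m + R / pi * (Y - m)

    # Average within clusters, then across clusters in each arm.
    ids = np.unique(cluster)
    phi_j = np.array([phi[cluster == j].mean() for j in ids])
    arm_j = np.array([A[cluster == j][0] for j in ids])
    return phi_j[arm_j == 1].mean() - phi_j[arm_j == 0].mean()


# Toy data: 4 clusters of 2 individuals, one outcome missing in a treated cluster.
cluster = [0, 0, 1, 1, 2, 2, 3, 3]
A = [1, 1, 1, 1, 0, 0, 0, 0]
X = [0, 1, 0, 1, 0, 1, 0, 1]
Y = [3, np.nan, 4, 6, 1, 2, 2, 3]
R = [1, 0, 1, 1, 1, 1, 1, 1]
print(aipw_cluster_ate(cluster, A, X, Y, R))
```

When every outcome is observed, the estimated observation probabilities equal 1 and the pseudo-outcome reduces to the outcome itself, so the estimate is the plain difference in averaged cluster means. In practice the two cell-mean models would be replaced by richer fits, and the estimate stays consistent if either the missingness model or the outcome model is correctly specified.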

Problem

Research questions and friction points this paper is trying to address.

Addresses multiple types of missing data in cluster-randomized trials simultaneously

Proposes doubly-robust estimation for treatment effects under missing data

Develops sensitivity analysis framework for evaluating missing data assumptions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Doubly-robust estimator for cluster-randomized trials

Handles missing outcomes, covariates, and cluster-population sizes simultaneously

Sensitivity analysis framework for missing data assumptions
