🤖 AI Summary
This paper addresses bias and invalid inference in average treatment effect (ATE) estimation under covariate-adaptive randomization when covariates are missing. It investigates the large-sample properties of regression-adjusted estimators under both stratified and general covariate-adaptive designs. The authors establish asymptotic normality of these estimators under missing-covariate conditions and propose consistent variance estimators that remain valid even when the regression model is misspecified. The estimators studied combine missing-data procedures, such as inverse-probability weighting and multiple imputation, with linear regression adjustment. Numerical experiments evaluate finite-sample estimation accuracy and confidence interval coverage across sample sizes and numbers of covariates. The work provides both theoretical guarantees and practical guidance for causal inference with missing covariates under covariate-adaptive randomization.
📝 Abstract
Covariate-adaptive randomization is widely used in clinical trials to balance prognostic factors, and regression adjustment is often used to further improve the efficiency of estimation and inference. In practice, covariates may contain missing values. Various methods have been proposed to handle missing covariates under simple randomization. However, the statistical properties of the resulting average treatment effect estimators under stratified randomization or, more generally, covariate-adaptive randomization remain unclear. To address this issue, we investigate the asymptotic properties of several average treatment effect estimators obtained by combining commonly used procedures for handling missing covariates with regression adjustment methods. Moreover, we derive consistent variance estimators to enable valid inference. Finally, we conduct a numerical study to evaluate the finite-sample performance of the considered estimators under various sample sizes and numbers of covariates, and we provide recommendations accordingly. Our analysis is model-free, meaning that the conclusions remain asymptotically valid even when the regression model is misspecified.
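To make the setting concrete, the following is a minimal Python sketch of one way a missing-covariate procedure can be combined with regression adjustment under stratified randomization. It is an illustration under stated assumptions, not the paper's method: the abstract does not say which procedures are studied, so mean imputation with missingness indicators is assumed here as one commonly used choice, and the adjustment is an ordinary least squares fit with treatment-by-covariate interactions. All function names (`stratified_block_assign`, `impute_with_indicator`, `adjusted_ate`) and the simulated data-generating process are hypothetical.

```python
import numpy as np


def stratified_block_assign(strata, rng, pi=0.5):
    """Within each stratum, treat (approximately) a fraction pi of the units."""
    a = np.zeros(strata.shape[0], dtype=int)
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        n_treat = int(round(pi * idx.size))
        a[rng.permutation(idx)[:n_treat]] = 1
    return a


def impute_with_indicator(x):
    """Mean-impute NaNs column by column and append missingness indicators.

    Assumed procedure for illustration only.
    """
    miss = np.isnan(x)
    col_mean = np.nanmean(x, axis=0)
    x_imp = np.where(miss, col_mean, x)
    return np.hstack([x_imp, miss.astype(float)])


def adjusted_ate(y, a, z):
    """OLS of y on (1, a, z_c, a * z_c) with centered covariates z_c.

    The coefficient on a is the regression-adjusted ATE estimate.
    """
    zc = z - z.mean(axis=0)
    design = np.column_stack([np.ones_like(y), a, zc, a[:, None] * zc])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]


# Simulated trial (hypothetical data-generating process, true ATE = 1.0).
rng = np.random.default_rng(0)
n = 2000
tau = 1.0
strata = rng.integers(0, 4, size=n)
x = rng.normal(size=(n, 2)) + 0.5 * strata[:, None]
a = stratified_block_assign(strata, rng)
y = tau * a + x @ np.array([1.0, -0.5]) + 0.3 * strata + rng.normal(size=n)

# Delete about 20% of covariate entries completely at random.
x_obs = x.copy()
x_obs[rng.random(size=x.shape) < 0.2] = np.nan

z = impute_with_indicator(x_obs)
ate_hat = adjusted_ate(y, a, z)
print(f"adjusted ATE estimate: {ate_hat:.3f} (true value {tau})")
```

The sketch returns only a point estimate. Valid inference under covariate-adaptive randomization requires a variance estimator that accounts for the design, which is what the abstract says the paper derives; the usual OLS standard errors should not be assumed to be appropriate here.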