🤖 AI Summary
This paper addresses causal inference in biobank studies, motivated by Mendelian randomization, where the survival outcome is right-censored and the instrumental variables (IVs) are numerous, weak, and possibly invalid. We adopt a semiparametric accelerated failure time (AFT) model and construct moment conditions from augmented inverse probability of censoring weighting (AIPCW), which uses both censored and uncensored observations. A heteroscedasticity-based condition on the treatment model yields point identification of the causal effect. For inference we propose GEL-NOW (Generalized Empirical Likelihood with Non-Orthogonal and Weak moments): a growing number of Neyman-orthogonal nuisance functions is estimated with deep neural networks, while the conditional censoring distribution, which is not Neyman orthogonal, enters the first-order asymptotics and is accounted for by an additional term in the asymptotic variance. A censoring-adjusted over-identification test incorporates the same variance component. Simulation studies and UK Biobank applications illustrate the method's robustness and practical utility.
📝 Abstract
We propose a semiparametric framework for causal inference with right-censored survival outcomes and many weak invalid instruments, motivated by Mendelian randomization in biobank studies where classical methods may fail. We adopt an accelerated failure time model and construct a moment condition based on augmented inverse probability of censoring weighting, incorporating both uncensored and censored observations. Under a heteroscedasticity-based condition on the treatment model, we establish point identification of the causal effect despite censoring and invalid instruments. We propose GEL-NOW (Generalized Empirical Likelihood with Non-Orthogonal and Weak moments) for valid inference under these conditions. A divergent number of Neyman orthogonal nuisance functions is estimated using deep neural networks. A key challenge is that the conditional censoring distribution is a non-Neyman orthogonal nuisance, contributing to the first-order asymptotics of the estimator for the target causal effect parameter. We derive the asymptotic distribution and explicitly incorporate this additional uncertainty into the asymptotic variance formula. We also introduce a censoring-adjusted over-identification test that accounts for this variance component. Simulation studies and UK Biobank applications demonstrate the method's robustness and practical utility.
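To make the inverse probability of censoring weighting idea concrete, here is a minimal sketch. It is not the paper's GEL-NOW estimator: it assumes a toy AFT model log T = a + b·D + ε with an exogenous treatment D, censoring that is independent of (T, D), and plain (non-augmented) weights, with no instruments and no neural networks. The function names `km_censoring_survival`, `ipcw_aft_fit`, and `simulate`, and all simulation settings, are hypothetical and chosen only for illustration.

```python
import numpy as np


def km_censoring_survival(y, delta):
    """Kaplan-Meier estimate of G(t-) = P(C >= t) at each observed time.

    Censoring is treated as the "event" (1 - delta). Assumes no tied times.
    """
    order = np.argsort(y, kind="stable")
    censored = 1.0 - delta[order]
    n = len(y)
    at_risk = n - np.arange(n)  # subjects with Y >= the current sorted time
    surv_after = np.cumprod(1.0 - censored / at_risk)
    surv_before = np.concatenate(([1.0], surv_after[:-1]))  # left limit
    g = np.empty(n)
    g[order] = surv_before  # map back to the original ordering
    return g


def ipcw_aft_fit(y, delta, d):
    """Solve sum_i w_i * x_i * (log y_i - x_i' theta) = 0 with w_i = delta_i / G(y_i-).

    Censored observations get weight zero; uncensored ones are up-weighted.
    """
    g = km_censoring_survival(y, delta)
    w = delta / g
    x = np.column_stack([np.ones_like(d), d])  # intercept and treatment
    lhs = x.T @ (w[:, None] * x)
    rhs = x.T @ (w * np.log(y))
    return np.linalg.solve(lhs, rhs), w


def simulate(n=4000, slope=1.0, seed=0):
    """Toy data (illustrative assumption): log T = slope * D + noise, independent C."""
    rng = np.random.default_rng(seed)
    d = rng.uniform(-1.0, 1.0, size=n)
    t = np.exp(slope * d + rng.normal(0.0, 0.5, size=n))
    c = rng.exponential(scale=4.0, size=n)  # censoring time, independent of (T, D)
    y = np.minimum(t, c)                    # observed follow-up time
    delta = (t <= c).astype(float)          # 1 if the event was observed
    return y, delta, d


y, delta, d = simulate()
theta_hat, weights = ipcw_aft_fit(y, delta, d)
print("censoring rate:", 1.0 - delta.mean())
print("IPCW (intercept, slope):", theta_hat)
```

In this toy setting the weights 1/Ĝ compensate for long survival times being censored more often, so the weighted least-squares equations target the same coefficients as a regression on fully observed log T. The augmented version described in the abstract additionally uses the censored observations, which this sketch discards.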