AI Summary
We address causal effect estimation with high-dimensional instrumental variables (IVs) and high-dimensional, non-sparse confounders. We propose a two-stage IV estimation framework that combines ridge regularization with sample splitting. Unlike existing Lasso-based approaches, the method imposes no sparsity assumptions on either the IVs or the confounders, yet delivers a consistent, asymptotically normal estimator of the treatment effect together with a consistent estimator of its variance. Theoretically, the framework is broadly applicable: its validity relies only on weak moment conditions and bounded eigenvalues, not on structural sparsity. In simulations, it consistently outperforms existing sparsity-based methods under both sparse and non-sparse designs. Empirically, applied to the returns to education, it robustly identifies a positive causal effect of schooling on earnings in a real-world setting with high-dimensional confounding, demonstrating good finite-sample performance and stability.
Abstract
Obtaining valid treatment effect inferences remains a challenging problem when dealing with numerous instruments and non-sparse control variables. In this paper, we propose a novel ridge regularization-based instrumental variables method for estimation and inference in the presence of both high-dimensional instrumental variables and high-dimensional control variables. The method is applicable both with and without sparsity assumptions. To address the bias caused by high-dimensional instruments, we introduce a two-step procedure incorporating a data-splitting strategy. We establish statistical properties of the estimator, including consistency and asymptotic normality. Furthermore, we develop statistical inference procedures by providing a consistent estimator of the estimator's asymptotic variance. The finite-sample performance of the proposed method is evaluated through numerical simulations. Results indicate that the new estimator consistently outperforms existing sparsity-based approaches across various settings, offering valuable insights for more complex scenarios. Finally, we provide an empirical application estimating the causal effect of schooling on earnings, addressing potential endogeneity through the use of high-dimensional instrumental variables and high-dimensional covariates.
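The two-step idea can be illustrated with a minimal sketch. This is not the paper's estimator: it is a simplified split-sample ridge IV under assumptions introduced here for illustration. The first stage is fitted by ridge on one half of the sample, the fitted treatment is used as a single instrument on the other half, and the controls are partialled out by ordinary least squares (so the sketch assumes fewer controls than observations per half). The function names, the penalty `lam`, and the simulated design are all hypothetical.

```python
import numpy as np

def ridge_fit(A, b, lam):
    """Closed-form ridge coefficients: (A'A + lam * I)^{-1} A'b."""
    p = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ b)

def residualize(v, X):
    """Residual of v after least-squares projection on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ coef

def split_ridge_iv(y, d, Z, X, lam=1.0, seed=0):
    """Illustrative split-sample ridge IV (a simplification, not the paper's method)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.permutation(n)
    a, b = idx[: n // 2], idx[n // 2:]

    # Step 1: ridge first stage on fold a only, regressing d on [Z, X].
    W = np.hstack([Z, X])
    coef = ridge_fit(W[a], d[a], lam)

    # Step 2: on fold b, use the out-of-fold prediction as the instrument.
    # Its coefficients do not depend on fold-b noise, which is what
    # removes the many-instruments overfitting bias.
    d_hat = W[b] @ coef

    # Partial out the controls on fold b (assumes X[b] has full column rank).
    y_r = residualize(y[b], X[b])
    d_r = residualize(d[b], X[b])
    i_r = residualize(d_hat, X[b])

    # Just-identified IV estimate and a heteroskedasticity-robust standard error.
    beta = (i_r @ y_r) / (i_r @ d_r)
    resid = y_r - beta * d_r
    se = np.sqrt(np.sum((i_r * resid) ** 2)) / abs(i_r @ d_r)
    return beta, se

# Simulated example; every value below is an illustrative assumption.
rng = np.random.default_rng(42)
n, pz, px, beta_true = 2000, 100, 50, 0.5
Z = rng.normal(size=(n, pz))                 # instruments
X = rng.normal(size=(n, px))                 # controls
pi = rng.normal(size=pz) / np.sqrt(pz)       # dense (non-sparse) first stage
gamma = rng.normal(size=px) / np.sqrt(px)
theta = rng.normal(size=px) / np.sqrt(px)
v = rng.normal(size=n)
eps = 0.8 * v + 0.6 * rng.normal(size=n)     # correlated with v, so d is endogenous
d = Z @ pi + X @ gamma + v
y = beta_true * d + X @ theta + eps

beta_hat, se_hat = split_ridge_iv(y, d, Z, X, lam=1.0)
print(f"beta_hat = {beta_hat:.3f}, se = {se_hat:.3f}")
```

Because only half of the sample enters the second step, this sketch gives up efficiency; swapping the roles of the two folds and averaging the two estimates is a common remedy, though whether the paper does so is not stated in the abstract.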