🤖 AI Summary
This paper addresses the inconsistency of two-stage least squares (TSLS) estimation that arises when a linear model has both endogenous regressors and conditional heteroskedasticity that depends on those regressors, termed “endogenous heteroskedasticity”. We first show why TSLS fails in this setting. To resolve it, we propose sufficient identification conditions together with a control function estimator, establish its consistency and asymptotic normality, and propose inference procedures. Monte Carlo simulations assess the finite-sample performance of the methods and compare implementation procedures. An empirical application on job training illustrates the approach. The core contribution is an identification and estimation framework for causal parameters that accommodates endogenous heteroskedasticity.
📝 Abstract
Linear regressions with endogeneity are widely used to estimate causal effects. This paper studies a framework with two common issues: endogeneity of the regressors, and heteroskedasticity that is allowed to depend on the endogenous regressors, i.e., endogenous heteroskedasticity. We show that the presence of such conditional heteroskedasticity in the structural regression renders the two-stage least squares estimator inconsistent. To solve this issue, we propose sufficient conditions together with a control function approach to identify and estimate the causal parameters of interest. We establish the limiting properties of the estimator, namely consistency and asymptotic normality, and propose inference procedures. Monte Carlo simulations provide evidence on the finite-sample performance of the proposed methods and evaluate different implementation procedures. We revisit an empirical application on job training to illustrate the methods.
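To make the mechanism concrete, the following is a minimal simulation sketch and not the paper's estimator. It assumes a hypothetical design in which the structural error is scaled by a known function of the endogenous regressor, sigma(x) = x^2, with jointly normal errors and a single instrument. The function name `simulate`, the parameter values, and the scale function are all illustrative assumptions. Under these assumptions, a control function regression that interacts the first-stage residual with sigma(x) recovers the slope, while TSLS does not.

```python
import numpy as np


def simulate(n=200_000, seed=0, beta0=1.0, beta1=2.0, pi=1.0, rho=0.5):
    """Return (TSLS slope, control-function slope) for one simulated sample.

    Hypothetical design, for illustration only:
        x = pi * z + v
        y = beta0 + beta1 * x + sigma(x) * eps,  with sigma(x) = x**2 (assumed)
        corr(eps, v) = rho, so x is endogenous.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # instrument
    v = rng.normal(size=n)                      # first-stage error
    u = rng.normal(size=n)                      # independent noise
    eps = rho * v + np.sqrt(1.0 - rho**2) * u   # structural error, unit variance
    x = pi * z + v                              # endogenous regressor
    y = beta0 + beta1 * x + x**2 * eps          # endogenous heteroskedasticity

    ones = np.ones(n)
    Z = np.column_stack([ones, z])

    # First stage: OLS of x on (1, z); keep fitted values and residuals.
    first_stage, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ first_stage
    v_hat = x - x_hat

    # TSLS: OLS of y on (1, x_hat).
    tsls, *_ = np.linalg.lstsq(np.column_stack([ones, x_hat]), y, rcond=None)

    # Control function: OLS of y on (1, x, sigma(x) * v_hat), which uses
    # E[y | x, z] = beta0 + beta1 * x + rho * sigma(x) * v in this design.
    W = np.column_stack([ones, x, x**2 * v_hat])
    cf, *_ = np.linalg.lstsq(W, y, rcond=None)

    return tsls[1], cf[1]


if __name__ == "__main__":
    tsls_slope, cf_slope = simulate()
    print(f"true slope: 2.0, TSLS: {tsls_slope:.3f}, control function: {cf_slope:.3f}")
```

In this particular design the instrument is correlated with the scaled error, since E[z x^2 eps] = 2 * pi * rho, so the TSLS slope has probability limit beta1 + 2 * rho rather than beta1. The sketch treats sigma(x) as known. A practical procedure would need to estimate the scale function, which is outside what this example shows.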