🤖 AI Summary
This paper provides a convergence analysis of deep feature instrumental variable (DFIV) regression, a two-stage nonparametric IV estimator that uses data-adaptive features learned by deep neural networks. Under standard nonparametric IV identification assumptions, plus a smoothness condition on the conditional distribution of the covariate given the instrument (which controls the difficulty of Stage 1), DFIV is shown to achieve the minimax optimal learning rate when the target structural function lies in a Besov space. The analysis further establishes two advantages of data-adaptive features over fixed-feature (kernel or sieve) IV methods: when the structural function has low spatial homogeneity (mixing smooth regions with spiky or discontinuous ones), DFIV remains rate-optimal while fixed-feature methods are strictly suboptimal; and DFIV is provably more data efficient than kernel-based two-stage estimators in its use of Stage 1 samples.
📝 Abstract
We provide a convergence analysis of deep feature instrumental variable (DFIV) regression (Xu et al., 2021), a nonparametric approach to IV regression using data-adaptive features learned by deep neural networks in two stages. We prove that the DFIV algorithm achieves the minimax optimal learning rate when the target structural function lies in a Besov space. This is shown under standard nonparametric IV assumptions and an additional smoothness assumption on the regularity of the conditional distribution of the covariate given the instrument, which controls the difficulty of Stage 1. We further demonstrate that DFIV, as a data-adaptive algorithm, is superior to fixed-feature (kernel or sieve) IV methods in two ways. First, when the target function possesses low spatial homogeneity (i.e., it has both smooth and spiky/discontinuous regions), DFIV still achieves the optimal rate, while fixed-feature methods are shown to be strictly suboptimal. Second, compared with kernel-based two-stage regression estimators, DFIV is provably more data efficient in its use of Stage 1 samples.
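To make the two-stage structure concrete, here is a minimal linear sketch of the pipeline that DFIV generalizes: Stage 1 regresses the treatment (features) on the instrument to estimate conditional-mean features, and Stage 2 regresses the outcome on those predicted features. DFIV replaces the identity feature maps below with features learned by deep networks; the simulated data, variable names, and coefficients here are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Illustrative confounded data-generating process (an assumption for this sketch):
# instrument Z, unobserved confounder U, treatment X, true structural effect 2.0.
Z = rng.normal(size=n)
U = rng.normal(size=n)
X = Z + U                        # Z is relevant and, by construction, exogenous
Y = 2.0 * X + 2.0 * U + 0.1 * rng.normal(size=n)

# Naive OLS of Y on X is biased because U drives both X and Y.
beta_ols = (X @ Y) / (X @ X)

# Stage 1: regress treatment features on instrument features to get E[X | Z].
# DFIV learns these feature maps with neural networks instead of using X, Z directly.
gamma = (Z @ X) / (Z @ Z)
X_hat = gamma * Z                # predicted conditional-mean features

# Stage 2: regress the outcome on the Stage 1 predictions.
beta_iv = (X_hat @ Y) / (X_hat @ X_hat)

print(f"OLS estimate: {beta_ols:.3f}")        # biased away from 2.0
print(f"Two-stage IV estimate: {beta_iv:.3f}")  # close to the true 2.0
```

The Stage 1 / Stage 2 sample-size trade-off analyzed in the paper corresponds to how accurately `X_hat` (the conditional-mean features) is estimated before the structural regression is run.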