🤖 AI Summary
Problem: Existing methods for causal estimation with continuous instrumental variables (IVs) either impose strong parametric models or assume homogeneous treatment effects, while fully nonparametric alternatives may perform poorly with moderate- to high-dimensional covariates. A unified framework covering both continuous and categorical IVs is also lacking.
Method: We propose a new framework for identifying the average treatment effect based on the Conditional Weighted Average Derivative Effect (CWADE), using a conditional Riesz representer to treat continuous and categorical IVs in a unified way. Because the average treatment effect is typically overidentified in this framework, the observed-data model is semiparametric with a nontrivial tangent space, which we characterize by constructing a second-order parametric submodel. Estimation builds on an influence function in the semiparametric model that is also locally efficient within a submodel.
Contribution/Results: The resulting estimator is locally efficient, triply robust, bounded, and easy to implement. The method is evaluated in simulation studies and applied to an observational clinical study from the Princess Margaret Cancer Centre, assessing the causal effect of excess body weight on two-year mortality among patients with non-small cell lung cancer (the so-called obesity paradox). The framework avoids the strong parametric and homogeneity restrictions of existing continuous-IV approaches to average treatment effect estimation.
📝 Abstract
Instrumental variables (IVs) are often continuous, arising in diverse fields such as economics, epidemiology, and the social sciences. Existing approaches for continuous IVs typically impose strong parametric models or assume homogeneous treatment effects, while fully nonparametric methods may perform poorly in moderate- to high-dimensional covariate settings. We propose a new framework for identifying the average treatment effect with continuous IVs via conditional weighted average derivative effects. Using a conditional Riesz representer, our framework unifies continuous and categorical IVs. In this framework, the average treatment effect is typically overidentified, leading to a semiparametric observed-data model with a nontrivial tangent space. Characterizing this tangent space involves a delicate construction of a second-order parametric submodel, which, to the best of our knowledge, has not been standard practice in this literature. For estimation, building on an influence function in the semiparametric model that is also locally efficient within a submodel, we develop a locally efficient, triply robust, bounded, and easy-to-implement estimator. We apply our methods to an observational clinical study from the Princess Margaret Cancer Centre to examine the so-called obesity paradox in oncology, assessing the causal effect of excess body weight on two-year mortality among patients with non-small cell lung cancer.
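For intuition, the role of a conditional Riesz representer can be sketched in generic form. The notation below (outcome $Y$, instrument $Z$, covariates $X$, weight $w$, conditional density $f(z \mid x)$ of $Z$ given $X$, and instrument propensity $\pi(X)$) is assumed here for illustration and is not taken from the paper. The condition that $w(z,x) f(z \mid x)$ vanishes on the boundary of the support of $Z$ is likewise an assumption of this sketch.

```latex
% Illustrative sketch only; symbols and regularity conditions are assumptions.
% Let m(z, x) = E[Y | Z = z, X = x].
\begin{align*}
% Continuous instrument: integrate by parts in z, using w f = 0 on the boundary.
E\!\left[ w(Z,X)\, \partial_z m(Z,X) \,\middle|\, X \right]
  &= \int w(z,X)\, f(z \mid X)\, \partial_z m(z,X)\, dz \\
  &= -\int m(z,X)\, \partial_z\!\left\{ w(z,X)\, f(z \mid X) \right\} dz \\
  &= E\!\left[ Y\, \alpha_c(Z,X) \,\middle|\, X \right],
  \qquad
  \alpha_c(z,x) = -\frac{\partial_z\!\left\{ w(z,x)\, f(z \mid x) \right\}}{f(z \mid x)} . \\[6pt]
% Binary instrument: the same form, with pi(X) = P(Z = 1 | X).
E[Y \mid Z=1, X] - E[Y \mid Z=0, X]
  &= E\!\left[ Y\, \alpha_b(Z,X) \,\middle|\, X \right],
  \qquad
  \alpha_b(z,x) = \frac{z - \pi(x)}{\pi(x)\,\{1 - \pi(x)\}} .
\end{align*}
```

In both cases a contrast in the instrument takes the form $E[Y \alpha(Z,X) \mid X]$, which is one way to see how a single representation could cover continuous and categorical IVs. Different admissible weights $w$ yield different expressions of this type, which is in line with the overidentification noted in the abstract, although the paper's exact estimand and conditions may differ from this sketch.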