🤖 AI Summary
This study addresses the challenge of learning optimal dynamic treatment regimes (DTRs) from observational data when the no-unmeasured-confounding assumption fails. We use the proximal causal inference framework to overcome the identifiability limitations of classical approaches. We propose three nonparametric identification strategies and establish the semiparametric efficiency bound for the value function of a given regime. We develop a $(K+1)$-multiply robust learning algorithm, where $K$ is the number of stages, that is robust to model misspecification and asymptotically efficient. As a by-product for marginal structural models, we also derive identification conditions and estimation procedures for counterfactual means under static regimes. Numerical experiments validate the efficiency and multiple robustness of the proposed methods. The methodology is relevant to causal decision-making domains such as precision medicine and personalized recommendation systems.
📝 Abstract
Dynamic treatment regimes are sequential decision rules that adapt treatment to an individual's time-varying characteristics and outcomes in order to achieve optimal effects, with applications in precision medicine, personalized recommendations, and dynamic marketing. Estimating optimal dynamic treatment regimes through sequential randomized trials can be costly and raise ethical concerns, which often necessitates the use of historical observational data. In this work, we use the proximal causal inference framework to learn optimal dynamic treatment regimes when the unconfoundedness assumption fails. Our contributions are fourfold: (i) we propose three nonparametric identification methods for optimal dynamic treatment regimes; (ii) we establish the semiparametric efficiency bound for the value function of a given regime; (iii) we propose a $(K+1)$-robust method for learning optimal dynamic treatment regimes, where $K$ is the number of stages; (iv) as a by-product for marginal structural models, we establish identification and estimation of counterfactual means under a static regime. Numerical experiments validate the efficiency and multiple robustness of our proposed methods.
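For orientation, the objects named in the abstract can be written out as follows. This is a sketch in assumed notation, not necessarily the paper's own: $\mathcal{H}_k$ (history available at stage $k$), $\mathcal{A}_k$ (treatment options at stage $k$), $Y(d)$ (potential outcome under regime $d$), $X$ (measured covariates), $Z$, $W$, and $h$ are symbols introduced here for illustration. The last two lines give the standard single-stage outcome-bridge identification from the proximal causal inference literature, which holds only under completeness and proxy conditions not restated here; the $K$-stage results of the paper may take a different form.

$$
\begin{aligned}
d &= (d_1, \dots, d_K), \qquad d_k : \mathcal{H}_k \to \mathcal{A}_k, \\
V(d) &= \mathbb{E}[Y(d)], \qquad d^{*} \in \arg\max_{d} V(d), \\
\mathbb{E}[Y \mid Z, A, X] &= \mathbb{E}[h(W, A, X) \mid Z, A, X], \\
\mathbb{E}[Y(a)] &= \mathbb{E}[h(W, a, X)].
\end{aligned}
$$

Here $Z$ and $W$ stand for treatment-inducing and outcome-inducing proxies of the unmeasured confounder, and $h$ is an outcome bridge function solving the integral equation in the third line. The value function $V(d)$ in the second line is the quantity whose semiparametric efficiency bound is referred to in contribution (ii).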