Valid post-selection inference in Robust Q-learning

📅 2022-08-05

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Problem: In high-dimensional estimation of personalized treatment strategies, standard statistical inference after variable selection fails because of selection-induced bias. Method: This paper adapts Universal Post-Selection Inference (UPoSI), which is agnostic to the selection mechanism, to the robust Q-learning framework. The approach also yields a uniform improvement to the confidence intervals constructed in this framework and, under certain rate assumptions, is shown to be asymptotically valid in multi-stage decision settings, giving tests with nominal Type-I error control and confidence intervals with nominal coverage. Results: Monte Carlo simulations compare the proposed method with selective inference and indicate better coverage accuracy and statistical power, while the method remains compatible with diverse data-driven variable selection procedures. It thus provides a generalizable foundation for statistical inference in robust Q-learning.
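The selection-induced failure described in the summary can be illustrated with a small Monte Carlo sketch. This is not the paper's procedure: it is a single-stage toy with pure-noise outcomes, the function name `simulate_coverage` and all settings are hypothetical, and the corrected interval uses a crude Bonferroni adjustment over the candidate variables as a stand-in for a simultaneous guarantee, not UPoSI itself. Normal quantiles are used in place of t quantiles for simplicity.

```python
import numpy as np
from statistics import NormalDist


def simulate_coverage(n=100, p=20, reps=2000, alpha=0.05, seed=0):
    """Return (naive, bonferroni) coverage of CIs for the selected slope.

    Hypothetical toy setup: the outcome is independent of all p candidate
    variables, so the true slope of whichever variable is selected is 0.
    """
    rng = np.random.default_rng(seed)
    z_naive = NormalDist().inv_cdf(1 - alpha / 2)
    # Bonferroni over p candidates: a simple stand-in, NOT the UPoSI bound.
    z_bonf = NormalDist().inv_cdf(1 - alpha / (2 * p))
    hits_naive = 0
    hits_bonf = 0
    for _ in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)  # pure noise: every true slope is 0
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        # Simple-regression slope (with intercept) of y on each column.
        sxx = (Xc ** 2).sum(axis=0)
        slopes = Xc.T @ yc / sxx
        # Residual sum of squares and standard error for each regression.
        resid_ss = (yc ** 2).sum() - slopes ** 2 * sxx
        se = np.sqrt(resid_ss / (n - 2) / sxx)
        # Data-driven selection: keep the column with the largest |t|.
        j = int(np.argmax(np.abs(slopes / se)))
        # Does each interval around the selected slope contain the truth (0)?
        hits_naive += int(abs(slopes[j]) <= z_naive * se[j])
        hits_bonf += int(abs(slopes[j]) <= z_bonf * se[j])
    return hits_naive / reps, hits_bonf / reps


if __name__ == "__main__":
    naive, bonf = simulate_coverage()
    print(f"naive 95% CI coverage after selection: {naive:.3f}")
    print(f"Bonferroni-adjusted coverage:          {bonf:.3f}")
```

In this toy setting the naive interval ignores that the variable was chosen because its estimate looked large, so its coverage falls well below the nominal level, while the adjusted interval stays close to it at the cost of extra width. The multi-stage, semiparametric setting of the paper needs the more careful treatment the summary describes.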

📝 Abstract

Constructing an optimal adaptive treatment strategy becomes complex when there are a large number of potential tailoring variables. In such scenarios, many of these extraneous variables may contribute little or no benefit to an adaptive strategy while increasing implementation costs and putting an undue burden on patients. Although existing methods allow selection of the informative prognostic factors, statistical inference is complicated by the data-driven selection process. To remedy this deficiency, we adapt the Universal Post-Selection Inference procedure to the semiparametric Robust Q-learning method and the unique challenges encountered in such multistage decision methods. In the process, we also identify a uniform improvement to confidence intervals constructed in this post-selection inference framework. Under certain rate assumptions, we provide theoretical results that demonstrate the validity of confidence regions and tests constructed from our proposed procedure. The performance of our method is compared to the Selective Inference framework through simulation studies, demonstrating the strengths of our procedure and its applicability to multiple selection mechanisms.

arXiv:2208.03233v1 [stat.ME] 5 Aug 2022

Problem

Research questions and friction points this paper is trying to address.

Addresses variable selection in robust Q-learning for treatment strategies

Ensures valid statistical inference after data-driven variable selection

Handles confounding and extraneous variables in adaptive treatment optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts UPoSI to Robust Q-learning

Addresses variable selection in multistage decisions

Ensures valid inference post-selection
