🤖 AI Summary
This paper addresses two critical limitations of Conditional Average Treatment Effect (CATE) estimators in personalized decision-making: (1) lack of theoretical guarantees on decision-theoretic performance, and (2) misalignment between estimation accuracy and downstream decision efficacy. We propose the first decision-risk-minimization–driven two-stage learning framework for CATE estimation. We theoretically establish the inherent suboptimality of standard two-stage estimators (e.g., DR-learner) under decision risk. To overcome this, we design a dual-objective learning criterion that jointly optimizes local CATE estimation accuracy and sensitivity to the decision boundary, and introduce an adaptive smoothing neural optimization method enabling differentiable, end-to-end, decision-aware training. We prove that our estimator strictly dominates conventional CATE estimators in terms of decision risk. Empirical evaluation across multiple synthetic benchmarks and real-world healthcare datasets—including ICU treatment allocation and oncology drug selection—demonstrates significant improvements in individualized decision accuracy. Our work establishes a new paradigm for trustworthy deployment of causal machine learning in high-stakes decision settings.
📝 Abstract
Decision-making across various fields, such as medicine, heavily relies on conditional average treatment effects (CATEs). Practitioners commonly make decisions by checking whether the estimated CATE is positive, even though the decision-making performance of modern CATE estimators is poorly understood from a theoretical perspective. In this paper, we study optimal decision-making based on two-stage CATE estimators (e.g., DR-learner), which are considered state-of-the-art and widely used in practice. We prove that, while such estimators may be optimal for estimating CATE, they can be suboptimal when used for decision-making. Intuitively, this occurs because such estimators prioritize CATE accuracy in regions far away from the decision boundary, which is ultimately irrelevant to decision-making. As a remedy, we propose a novel two-stage learning objective that retargets the CATE to balance CATE estimation error and decision performance. We then propose a neural method that optimizes an adaptively-smoothed approximation of our learning objective. Finally, we confirm the effectiveness of our method both empirically and theoretically. In sum, our work is the first to show how two-stage CATE estimators can be adapted for optimal decision-making.