Causal Post-Processing of Predictive Models

📅 2024-06-13
🤖 AI Summary
In real-world decision-making, predictive models are frequently misapplied for causal interventions (e.g., personalized recommendation, precision medicine), despite their optimization objectives being misaligned with causal effect estimation; meanwhile, scarcity of experimental data hinders training dedicated causal models. This paper proposes Causal Post-Processing (CPP), a framework that calibrates prediction scores using minimal experimental data—without retraining the original predictive model—to enable causal effect estimation, individual-level ranking, and threshold-based decisions. The paper systematically introduces three CPP approaches—monotonic post-processing, correction post-processing, and model-based post-processing—achieving tunable trade-offs between statistical efficiency and modeling flexibility via monotonic regression, calibration learning, and parametric effect modeling. Empirical and simulation studies demonstrate that CPP significantly outperforms conventional causal methods under data scarcity, while offering high scalability and data efficiency.

📝 Abstract
Decision makers across various domains rely on predictive models to guide individual-level intervention decisions. However, these models are typically trained to predict outcomes rather than causal effects, leading to misalignments when they are used for causal decision making, and experimental data for training effective causal-effect models is often limited. To address this issue, we propose causal post-processing (CPP), a family of techniques for refining predictive scores to better align with causal effects using limited experimental data. Rather than training separate causal models for each intervention, causal post-processing can adapt existing predictive scores to support different decision-making requirements, such as estimating effect sizes, ranking individuals by expected effects, or classifying individuals based on an intervention threshold. We introduce three main CPP approaches -- monotonic post-processing, correction post-processing, and model-based post-processing -- each balancing statistical efficiency and flexibility differently. Through simulations and an empirical application in advertising, we demonstrate that causal post-processing improves intervention decisions, particularly in settings where experimental data is expensive or difficult to obtain at scale. Our findings highlight the advantages of integrating non-causal predictive models with experimental data, rather than treating them as competing alternatives, which provides a scalable and data-efficient approach to causal inference for decision making.
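As a rough illustration of the monotonic post-processing idea described in the abstract (a minimal sketch under assumed names, not the authors' implementation): bin units by their predictive score, estimate the treatment effect in each bin from the experimental data (treated-minus-control mean outcome), and then enforce that the bin effects are non-decreasing in the score using the pool-adjacent-violators algorithm (PAVA), the standard routine behind isotonic regression.

```python
def pava(values, weights=None):
    # Pool-adjacent-violators: least-squares non-decreasing fit.
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block: [mean, total_weight, count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    out = []
    for m, _, c in blocks:
        out.extend([m] * c)
    return out


def monotonic_cpp(scores, treated, outcomes, n_bins=5):
    """Map predictive scores to monotone effect estimates (sketch).

    Units are binned by score; each bin's effect is the treated-minus-
    control mean outcome in the experiment; PAVA then forces the bin
    effects to be non-decreasing in the score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    size = len(order) // n_bins
    raw_effects, upper_cuts = [], []
    for b in range(n_bins):
        hi = (b + 1) * size if b < n_bins - 1 else len(order)
        idx = order[b * size:hi]
        t = [outcomes[i] for i in idx if treated[i]]
        c = [outcomes[i] for i in idx if not treated[i]]
        raw_effects.append(sum(t) / len(t) - sum(c) / len(c))
        upper_cuts.append(scores[idx[-1]])
    effects = pava(raw_effects)

    def score_to_effect(s):
        # Piecewise-constant lookup: first bin whose upper score cut
        # covers s; scores above the last cut get the top effect.
        for cut, e in zip(upper_cuts, effects):
            if s <= cut:
                return e
        return effects[-1]

    return score_to_effect
```

Because the fitted mapping is monotone in the original score, it preserves the predictive model's ranking of individuals while rescaling scores into effect-sized units usable for thresholded intervention decisions.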
Problem

Research questions and friction points this paper is trying to address.

Align predictive models with causal effects using limited data
Adapt predictive scores for diverse decision-making requirements
Improve intervention decisions in data-scarce experimental settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

CPP refines predictive scores for causal effects
Three CPP approaches balance efficiency and flexibility
CPP integrates predictive models with experimental data
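The correction post-processing idea above can likewise be sketched in miniature (hypothetical helper names, not the paper's code): learn a simple affine calibration that maps predictive scores to experimental effect estimates by least squares, trading the flexibility of the monotone fit for greater statistical efficiency when experimental data is scarce.

```python
def affine_correction(scores, effect_estimates):
    """Least-squares fit of tau(s) = a + b * s to per-bin experimental
    effect estimates; a toy stand-in for correction post-processing."""
    n = len(scores)
    ms = sum(scores) / n
    me = sum(effect_estimates) / n
    b = sum((s - ms) * (e - me) for s, e in zip(scores, effect_estimates)) \
        / sum((s - ms) ** 2 for s in scores)
    a = me - b * ms
    return lambda s: a + b * s
```

For example, `affine_correction([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])` recovers the line `tau(s) = 1 + 2s`, so a predictive score of 2 is calibrated to an effect estimate of 5. With only two free parameters, such a correction can be fit from far fewer experimental observations than a fully nonparametric causal model.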
Carlos Fernández-Loría
Hong Kong University of Science and Technology
Yanfang Hou
Hong Kong University of Science and Technology
F. Provost
New York University
Jennifer Hill
New York University
Statistics · Causal Inference · Bayesian non-parametrics · Machine Learning · Missing Data