Robust Design and Evaluation of Predictive Algorithms under Unobserved Confounding

📅 2022-12-19
📈 Citations: 5
Influential: 0
📄 PDF
🤖 AI Summary
In selective-labels settings, outcomes are observed only for units that past decision makers selected, so unobserved confounding creates an identification problem that undermines both predictive fairness and the reliability of risk assessment. This paper introduces a framework for robust design and evaluation of predictive algorithms: (1) it unifies formalizations of confounding-bound strategies, including proxy outcomes and instrumental variables; (2) it bounds a range of performance measures (e.g., TPR, MSE, conditional likelihood); and (3) it quantifies how confounding assumptions affect fairness conclusions across demographic groups. Applied to administrative data from a large Australian financial institution, the framework reveals up to a 32% disparity in credit risk predictions under alternative confounding assumptions, with fairness assessments for low-income groups reversing entirely. These results underscore the necessity of robust evaluation for high-stakes decision-making.
📝 Abstract
Predictive algorithms inform consequential decisions in settings with selective labels: outcomes are observed only for units selected by past decision makers. This creates an identification problem under unobserved confounding -- when selected and unselected units differ in unobserved ways that affect outcomes. We propose a framework for robust design and evaluation of predictive algorithms that bounds how much outcomes may differ between selected and unselected units with the same observed characteristics. These bounds formalize common empirical strategies including proxy outcomes and instrumental variables. Our estimators work across bounding strategies and performance measures such as conditional likelihoods, mean square error, and true/false positive rates. Using administrative data from a large Australian financial institution, we show that varying confounding assumptions substantially affects credit risk predictions and fairness evaluations across income groups.
Problem

Research questions and friction points this paper is trying to address.

Addresses predictive algorithm evaluation under unobserved confounding
Proposes framework bounding outcome differences between selected/unselected units
Demonstrates confounding impact on credit risk and fairness evaluations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bounding outcome differences between selected and unselected units
Formalizing proxy outcomes and instrumental variables strategies
Developing estimators across bounding strategies and performance measures
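The bounding idea above can be illustrated with a minimal sketch. For illustration only, assume a crude sensitivity model in which each unselected unit's unobserved outcome lies within +/- delta of the selected units' mean outcome (clipped to [0, 1]); this is a stand-in for the paper's actual confounding bounds, and the names `mse_bounds` and `delta` are hypothetical, not from the paper:

```python
import numpy as np

def mse_bounds(y_pred, y_obs, selected, delta):
    """Worst-/best-case bounds on population MSE under selective labels.

    y_obs is trusted only where selected is True; for unselected units
    the unobserved outcome is assumed (illustratively) to lie within
    +/- delta of the selected units' mean, clipped to [0, 1].
    """
    sel = np.asarray(selected, dtype=bool)
    # Selected units: squared error is point-identified from observed labels.
    se_sel = (y_pred[sel] - y_obs[sel]) ** 2
    # Unselected units: the outcome is only known to lie in [lo, hi].
    mu = y_obs[sel].mean()
    lo, hi = max(0.0, mu - delta), min(1.0, mu + delta)
    p = y_pred[~sel]
    # Worst case: outcome sits at whichever interval endpoint is farther
    # from the prediction; best case: zero error if the prediction falls
    # inside the interval, otherwise the nearer endpoint.
    worst = np.maximum((p - lo) ** 2, (p - hi) ** 2)
    best = np.where((p >= lo) & (p <= hi), 0.0,
                    np.minimum((p - lo) ** 2, (p - hi) ** 2))
    n = len(y_pred)
    return (se_sel.sum() + best.sum()) / n, (se_sel.sum() + worst.sum()) / n

# Toy example: outcomes observed only for the first two (selected) units.
y_pred = np.array([0.2, 0.8, 0.5, 0.6])
y_obs = np.array([0.0, 1.0, 0.0, 0.0])  # last two entries are placeholders
selected = np.array([True, True, False, False])
lower, upper = mse_bounds(y_pred, y_obs, selected, delta=0.3)
```

A wider `delta` (a weaker confounding assumption) widens the interval, which mirrors the paper's point that performance and fairness conclusions can change, or even reverse, as the assumed confounding bound varies.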