🤖 AI Summary
Existing fair machine learning methods learn fair representations without accounting for how a human decision affects a downstream outcome, so they do not address outcome disparities that arise from that decision. This work introduces “representational disparities”: interpretable differences between the representation of the input seen by the observed human decision-maker and the one seen by a desired decision-maker whose decisions reduce the downstream disparity. Learning these disparities is framed as a multi-objective optimization problem solved with a neural network. Under simplifying assumptions, the authors prove that the model learns interpretable weights that fully mitigate the outcome disparity, which points to specific nudges that could correct the human decision. The objectives are validated and the results interpreted on the German Credit, Adult, and Heritage Health datasets.
📝 Abstract
We propose a fair machine learning algorithm that models interpretable differences between observed and desired human decision-making, where the desired decision-making aims to reduce disparity in a downstream outcome affected by the human decision. Prior work learns fair representations without considering the outcome in the decision-making process. We model the outcome disparities as arising from the different representations of the input seen by the observed and the desired decision-maker, which we term representational disparities. Our goal is to learn interpretable representational disparities that could be corrected by specific nudges to the human decision, thereby mitigating disparities in the downstream outcome. We frame this as a multi-objective optimization problem using a neural network. Under reasonable simplifying assumptions, we prove that our neural network model of the representational disparity learns interpretable weights that fully mitigate the outcome disparity. We validate our objectives and interpret the results using the real-world German Credit, Adult, and Heritage Health datasets.
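
To make the multi-objective framing concrete, here is a toy sketch in plain Python. It is not the authors' model: the abstract does not give the architecture or the exact objectives, so everything in it is an illustrative assumption. That includes the per-feature weight vector `w` standing in for the representational disparity, the logistic decision rule with fixed coefficients `theta`, the use of the gap in mean decision probability as a proxy for the downstream outcome disparity, the L1 penalty that keeps `w` close to the observed representation, and the trade-off parameter `lam`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decisions(X, theta, w):
    """Decision probabilities when the decision-maker sees features reweighted by w.

    w = [1, 1, ...] corresponds to the observed decision-maker (assumption).
    """
    return [
        sigmoid(sum(t * wi * xi for t, wi, xi in zip(theta, w, x)))
        for x in X
    ]

def outcome_disparity(probs, groups):
    """Absolute gap in mean decision probability between groups 0 and 1.

    Used here only as a stand-in for the downstream outcome disparity.
    """
    g0 = [p for p, g in zip(probs, groups) if g == 0]
    g1 = [p for p, g in zip(probs, groups) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))

def scalarized_loss(X, groups, theta, w, lam=0.1):
    """Weighted sum of two objectives: low disparity, small change to w."""
    disparity = outcome_disparity(decisions(X, theta, w), groups)
    deviation = sum(abs(wi - 1.0) for wi in w)  # distance from observed weights
    return disparity + lam * deviation

# Hypothetical toy data: the second feature coincides with group membership.
X = [[1.0, 1.0], [0.5, 1.0], [1.0, 0.0], [0.5, 0.0]]
groups = [0, 0, 1, 1]
theta = [1.0, 2.0]

observed = scalarized_loss(X, groups, theta, [1.0, 1.0])
desired = scalarized_loss(X, groups, theta, [1.0, 0.0])
print(observed > desired)
```

In this toy setting, setting the weight on the second feature to zero removes the gap between the groups at the cost of one unit of deviation. That weight is interpretable in the sense the abstract describes: it identifies which part of the input the decision-maker would need to be nudged to discount.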