🤖 AI Summary
This paper investigates the accuracy–fairness trade-off in fair representation learning (FRL) for supervised learning, aiming to preserve predictive accuracy for the target variable $Y$ while mitigating dependence on the sensitive attribute $S$. It unifies three major fairness criteria, namely independence, separation (equalized odds), and calibration, within a single analytical framework. To this end, the authors introduce the kernelized equalized odds criterion $\mathrm{EO}_k$, the first differentiable and statistically estimable measure that jointly quantifies fairness and accuracy. They design an empirical estimator $\hat{\mathrm{EO}}_k$, computable in quadratic time with a linear-time approximation, and derive a concentration inequality for it, providing finite-sample error bounds and formal fairness certification. Theoretically, the work establishes precise analytical relationships among the three fairness notions, yielding a verifiable, robust paradigm for fair learning backed by rigorous statistical guarantees.
📝 Abstract
This paper introduces a novel kernel-based formulation of the Equalized Odds (EO) criterion, denoted $\mathrm{EO}_k$, for fair representation learning (FRL) in supervised settings. The central goal of FRL is to mitigate discrimination with respect to a sensitive attribute $S$ while preserving prediction accuracy for the target variable $Y$. Our proposed criterion enables a rigorous and interpretable quantification of three core fairness objectives: independence (the prediction $\hat{Y}$ is independent of $S$), separation (also known as equalized odds; $\hat{Y}$ is independent of $S$ conditioned on the target attribute $Y$), and calibration ($Y$ is independent of $S$ conditioned on the prediction $\hat{Y}$). We show that in the unbiased setting ($Y$ is independent of $S$), $\mathrm{EO}_k$ satisfies both independence and separation, while in the biased setting ($Y$ depends on $S$) it uniquely preserves predictive accuracy while lower bounding independence and calibration, thereby offering a unified analytical characterization of the trade-offs among these fairness criteria. We further define the empirical counterpart $\hat{\mathrm{EO}}_k$, a kernel-based statistic computable in quadratic time, with linear-time approximations also available. A concentration inequality for $\hat{\mathrm{EO}}_k$ is derived, providing performance guarantees and error bounds that serve as practical certificates of fairness compliance. While our focus is on theoretical development, the results lay essential groundwork for principled and provably fair algorithmic design in future empirical studies.
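The abstract does not spell out the form of the estimator $\hat{\mathrm{EO}}_k$, but the description (a kernel-based statistic of conditional dependence between $\hat{Y}$ and $S$ given $Y$, computable in quadratic time) matches the shape of HSIC-style estimators. The sketch below is an illustrative stand-in under that assumption, not the paper's actual definition: it averages a biased empirical HSIC between predictions and the sensitive attribute within each class of $Y$, using RBF Gram matrices, at $O(n^2)$ cost per class. The function names (`rbf_gram`, `hsic`, `eo_statistic`) and the RBF bandwidth choice are hypothetical.

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """RBF Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    x = np.atleast_2d(np.asarray(x, dtype=float)).reshape(len(x), -1)
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def hsic(K, L):
    """Biased empirical HSIC from two n x n Gram matrices (quadratic time)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n**2

def eo_statistic(y_hat, s, y):
    """Illustrative equalized-odds-style kernel statistic (NOT the paper's EO_k).

    Averages the within-class kernel dependence HSIC(Y_hat, S | Y = c) over the
    observed classes c of Y; zero dependence within every class would indicate
    separation (equalized odds) holds for these samples.
    """
    y_hat, s, y = np.asarray(y_hat), np.asarray(s), np.asarray(y)
    vals = []
    for c in np.unique(y):
        idx = (y == c)
        if idx.sum() < 2:
            continue  # too few samples in this class to estimate dependence
        vals.append(hsic(rbf_gram(y_hat[idx]), rbf_gram(s[idx])))
    return float(np.mean(vals))
```

As a sanity check, a predictor that depends only on $Y$ should score near zero, while one that additionally leaks $S$ within each class should score strictly higher; the biased HSIC estimate is nonnegative by construction since both centered Gram matrices are positive semidefinite.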