🤖 AI Summary
Evaluating and ensuring strong fairness guarantees such as multiaccuracy and multicalibration is challenging when sensitive attributes are missing, as conventional fairness frameworks assume full access to such labels.
Method: This paper pioneers the use of proxy-sensitive attributes for non-parity-based fairness frameworks, proposing a theoretically grounded optimization framework that enforces multiaccuracy and multicalibration constraints via proxy attributes. It rigorously characterizes how fairness guarantees on proxy groups propagate to the true (unobserved) demographic groups.
Contribution/Results: (1) It derives a provable upper bound on the fairness deviation with respect to the true groups; (2) it proves that imposing fairness constraints on proxy groups effectively mitigates worst-case unfairness on the unknown true groups. Empirical evaluation across multiple real-world datasets demonstrates a significant reduction in inter-group prediction disparities. The approach establishes a new paradigm for fair machine learning in high-stakes settings where sensitive labels are unavailable.
📝 Abstract
As the use of predictive machine learning algorithms increases in high-stakes decision-making, it is imperative that these algorithms are fair across sensitive groups. Unfortunately, measuring and enforcing fairness in real-world applications can be challenging due to missing or incomplete sensitive group data. Proxy-sensitive attributes have been proposed as a practical and effective solution in these settings, but only for parity-based fairness notions. How to evaluate and control fairness with missing sensitive group data under newer, more flexible frameworks, such as multiaccuracy and multicalibration, remains unexplored. In this work, we address this gap by demonstrating that in the absence of sensitive group data, proxy-sensitive attributes can provably be used to derive actionable upper bounds on the true multiaccuracy and multicalibration violations, providing insights into a model's potential worst-case fairness violations. Additionally, we show that adjusting models to satisfy multiaccuracy and multicalibration across proxy-sensitive attributes can significantly mitigate these violations for the true, but unknown, sensitive groups. Through several experiments on real-world datasets, we illustrate that approximate multiaccuracy and multicalibration can be achieved even when sensitive group information is incomplete or unavailable.
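To make the proxy-group idea concrete, the sketch below checks and enforces a simplified notion of multiaccuracy (largest mean residual conditioned on group membership) using a noisy proxy for an unobserved sensitive attribute. The function names (`multiaccuracy_violation`, `correct_on_groups`), the synthetic data, and the single residual-correction pass are illustrative assumptions, not the paper's optimization framework or datasets:

```python
import numpy as np

def multiaccuracy_violation(y, preds, groups):
    """Simplified multiaccuracy check: the largest absolute mean residual
    y - f(x), conditioned on membership in each supplied group."""
    residual = y - preds
    return max(abs(residual[g].mean()) for g in groups if g.any())

def correct_on_groups(y, preds, groups):
    """One residual-correction pass per (disjoint) group -- a basic
    multiaccuracy-boosting step -- with predictions clipped to [0, 1]."""
    adjusted = preds.astype(float).copy()
    for g in groups:
        if g.any():
            adjusted[g] += (y[g] - adjusted[g]).mean()
    return np.clip(adjusted, 0.0, 1.0)

# Synthetic demo: the true group is unobserved at adjustment time; a noisy
# proxy (10% membership noise) stands in for it, as in the proxy setting.
rng = np.random.default_rng(0)
n = 4000
true_group = rng.random(n) < 0.5
proxy_group = true_group ^ (rng.random(n) < 0.1)  # noisy stand-in
y = (rng.random(n) < np.where(true_group, 0.7, 0.3)).astype(float)
preds = np.full(n, 0.5)  # group-blind baseline model

true_groups = [true_group, ~true_group]
proxy_groups = [proxy_group, ~proxy_group]

before_true = multiaccuracy_violation(y, preds, true_groups)
adjusted = correct_on_groups(y, preds, proxy_groups)  # uses proxies only
after_true = multiaccuracy_violation(y, adjusted, true_groups)
after_proxy = multiaccuracy_violation(y, adjusted, proxy_groups)
```

In this toy setup, enforcing multiaccuracy on the proxy groups alone also shrinks the residual gap on the unobserved true groups (`after_true` falls well below `before_true`), which is the qualitative behavior the paper's upper bounds formalize.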