AI Summary
This paper addresses group-level unfairness arising from multiple sensitive attributes (e.g., joint gender and race categories) in predictive AI systems. To this end, we propose a model-agnostic, interpretable, and scalable fairness rectification framework. Our method integrates optimal transport theory into multivariate fairness modeling, enabling, for the first time, decomposable quantification and sequential optimization of fairness disparities across combinations of sensitive attributes. The framework supports post-hoc fairness correction and provides global interpretability via attribution-based analysis. We release an open-source Python package with interactive visualization tools. Experiments on data derived from the US Census demonstrate that our approach substantially improves multidimensional fairness metrics, including statistical parity across multiple demographic groups, while preserving predictive accuracy. The method thus combines practical applicability with theoretical rigor in fair machine learning.
Abstract
Algorithmic fairness has received considerable attention due to the failures of various predictive AI systems that have been found to be unfairly biased against subgroups of the population. Many approaches have been proposed to mitigate such biases in predictive systems; however, they often struggle to provide accurate estimates and transparent correction mechanisms when multiple sensitive variables, such as a combination of gender and race, are involved. This paper introduces a new open-source Python package, EquiPy, which provides an easy-to-use and model-agnostic toolbox for efficiently achieving fairness across multiple sensitive variables. It also offers comprehensive graphical utilities that enable the user to interpret the influence of each sensitive variable within a global context. EquiPy builds on theoretical results that allow the complexity arising from multiple sensitive variables to be broken down into easier-to-solve sub-problems. We demonstrate its ease of use, for both mitigation and interpretation, on publicly available data derived from the US Census, and provide sample code for its use.
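To make the underlying idea concrete, the sketch below illustrates the kind of optimal-transport rectification the abstract describes: each group's score distribution is mapped, via its quantile function, to the Wasserstein-2 barycenter of all group distributions (which enforces statistical parity), and the map is applied sequentially, one sensitive attribute at a time. This is a minimal NumPy illustration of the general technique, not the EquiPy API; the function names `wasserstein_fair` and `sequential_fair` are our own.

```python
import numpy as np

def wasserstein_fair(scores, groups):
    """Map each group's scores onto the weighted Wasserstein-2 barycenter
    of the per-group score distributions (statistical parity in 1-D).
    Illustrative sketch only -- not the EquiPy API."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(scores)
    labels, counts = np.unique(groups, return_counts=True)
    weights = counts / counts.sum()
    # Per-group quantile functions evaluated on a common probability grid.
    grid = np.linspace(0.0, 1.0, 101)
    quantiles = {g: np.quantile(scores[groups == g], grid) for g in labels}
    # In 1-D, the barycenter's quantile function is the weighted average
    # of the group quantile functions.
    bary = sum(w * quantiles[g] for g, w in zip(labels, weights))
    for g in labels:
        mask = groups == g
        # Empirical CDF rank of each score within its own group ...
        ranks = np.searchsorted(np.sort(scores[mask]), scores[mask],
                                side="right") / mask.sum()
        # ... pushed through the barycenter's quantile function.
        out[mask] = np.interp(ranks, grid, bary)
    return out

def sequential_fair(scores, sensitive_matrix):
    """Apply the barycenter map sequentially, one sensitive attribute
    (one column of sensitive_matrix) at a time."""
    adjusted = np.asarray(scores, dtype=float)
    for j in range(sensitive_matrix.shape[1]):
        adjusted = wasserstein_fair(adjusted, sensitive_matrix[:, j])
    return adjusted
```

After rectification, the per-group score distributions (and hence their means) are approximately equal, at the cost of some predictive sharpness; the sequential decomposition is what keeps the multi-attribute problem tractable.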