🤖 AI Summary
In federated learning, skewed local data distributions across clients can create fairness conflicts involving multiple sensitive attributes, yet existing approaches largely target a single sensitive attribute and cannot assess how fairness disparities vary across heterogeneous clients. To address this, the authors propose FeDa4Fair: an open-source benchmark framework enabling fairness evaluation at two levels, global and client-specific. Its core contributions are a configurable data-generation mechanism that simulates realistic client-level biases, four publicly released tabular datasets exhibiting diverse and controlled bias heterogeneity, benchmarks for comparing fairness-mitigation methods, and ready-to-use evaluation functions. FeDa4Fair thereby supports fine-grained, reproducible, quantitative analysis of fairness across multiple sensitive attributes and client-specific fairness requirements, improving rigor and comparability in federated fairness research.
📝 Abstract
Federated Learning (FL) enables collaborative model training across multiple clients without sharing clients' private data. However, fairness remains a key concern, as biases in local clients' datasets can impact the entire federated system. Heterogeneous data distributions across clients may lead to models that are fairer for some clients than others. Although several fairness-enhancing solutions exist in the literature, most focus on mitigating bias for a single sensitive attribute, typically binary, overlooking the diverse and sometimes conflicting fairness needs of different clients. This narrow perspective can reduce the effectiveness of fairness interventions for the different clients. To support more robust and reproducible fairness research in FL, we aim to enable consistent benchmarking of fairness-aware FL methods at both the global and client levels. In this paper, we contribute in three ways: (1) we introduce FeDa4Fair, a library to generate tabular datasets tailored to evaluating fair FL methods under heterogeneous client bias; (2) we release four bias-heterogeneous datasets and corresponding benchmarks to compare fairness mitigation methods in a controlled environment; (3) we provide ready-to-use functions for evaluating fairness outcomes for these datasets.
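To make the two evaluation levels concrete, the sketch below computes a standard group-fairness metric, the demographic parity difference (the gap in positive-prediction rates between two groups of a binary sensitive attribute), both per client and on the pooled data. This is a minimal, self-contained illustration with toy data; the function and client names are hypothetical and do not reflect FeDa4Fair's actual API.

```python
# Hypothetical sketch (not FeDa4Fair's API): demographic parity difference
# (DPD) evaluated per client and on the pooled "global" predictions.

def demographic_parity_difference(y_pred, sensitive):
    """Absolute gap in positive-prediction rates between the two groups
    of a binary sensitive attribute (0/1)."""
    rates = {}
    for g in (0, 1):
        preds = [p for p, s in zip(y_pred, sensitive) if s == g]
        rates[g] = sum(preds) / len(preds)
    return abs(rates[0] - rates[1])

# Toy binary predictions and sensitive-attribute values for two clients.
clients = {
    "client_a": ([1, 1, 0, 0, 1, 0], [0, 0, 0, 1, 1, 1]),
    "client_b": ([1, 0, 1, 1, 0, 0], [0, 1, 0, 1, 0, 1]),
}

# Client-level fairness: one DPD value per client.
client_dpd = {name: demographic_parity_difference(y, s)
              for name, (y, s) in clients.items()}

# Global fairness: DPD over the pooled predictions of all clients.
all_y = [p for y, _ in clients.values() for p in y]
all_s = [g for _, s in clients.values() for g in s]
global_dpd = demographic_parity_difference(all_y, all_s)
```

A model can look fair globally while being unfair for individual clients (or vice versa), which is why evaluating at both levels, as the paper proposes, matters.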