AI Summary
This work addresses the challenge of assessing group fairness across data silos under privacy constraints, where local and global fairness metrics often diverge. The authors propose an auditing method for the horizontal federated learning setting that estimates the global Wasserstein–Fréchet variance without sharing raw data: each participant sends only group counts and quantile summaries of its score distributions. An ANOVA-style decomposition of the squared Wasserstein distance separates the contributions of selection bias and cross-silo heterogeneity to the observed fairness disparity. The resulting single-round, low-bias federated estimator combines quantile sketches with nonparametric theory, keeps communication overhead low, and comes with provable error bounds. Experiments on synthetic data and COMPAS indicate that a few dozen quantiles are enough to recover the global disparity and diagnose its sources.
Abstract
Many fairness goals are defined at a population level that misaligns with siloed data collection, which remains unsharable due to privacy regulations. Horizontal federated learning (FL) enables collaborative modeling across clients with aligned features without sharing raw data. We study federated auditing of demographic parity through score distributions, measuring disparity as a Wasserstein--Fréchet variance between sensitive-group score laws, and expressing the population metric in a federated form that makes explicit how silo-specific selection drives local-global mismatch. For the squared Wasserstein distance, we prove an ANOVA-style decomposition that separates (i) selection-induced mixture effects from (ii) cross-silo heterogeneity, yielding tight bounds linking local and global metrics. We then propose a one-shot, communication-efficient protocol in which each silo shares only group counts and a quantile summary of its local score distributions, enabling the server to estimate global disparity and its decomposition, with $O(1/k)$ discretization bias ($k$ quantiles) and finite-sample guarantees. Experiments on synthetic data and COMPAS show that a few dozen quantiles suffice to recover global disparity and diagnose its sources.
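The one-shot protocol described above can be illustrated with a minimal sketch. It is not the authors' estimator, and it does not compute the ANOVA-style decomposition. It relies on the standard one-dimensional facts that $W_2^2(\mu,\nu)=\int_0^1 (F_\mu^{-1}(u)-F_\nu^{-1}(u))^2\,du$ and that the Wasserstein barycenter's quantile function is the weighted average of the group quantile functions. Several choices are assumptions made for illustration only: the function names `silo_summary`, `pooled_quantiles`, and `global_disparity` are hypothetical, groups are weighted by their pooled population share, each silo's summary is treated as $k$ equal-mass atoms at mid-point quantile levels, and the integral is approximated on an `m`-point grid.

```python
import numpy as np

def silo_summary(scores, groups, k):
    """Run locally at one silo: per-group count and k mid-point quantiles.
    Only this summary leaves the silo, never the raw scores."""
    levels = (np.arange(k) + 0.5) / k
    summary = {}
    for g in np.unique(groups):
        s = scores[groups == g]
        summary[g] = (len(s), np.quantile(s, levels))
    return summary

def pooled_quantiles(summaries, group, levels):
    """Server side: quantiles of the count-weighted mixture of the silo
    laws for one group, with each silo law approximated by k atoms."""
    atoms, weights = [], []
    for summ in summaries:
        if group in summ:
            n, q = summ[group]
            atoms.append(q)
            weights.append(np.full(len(q), n / len(q)))  # mass n/k per atom
    atoms, weights = np.concatenate(atoms), np.concatenate(weights)
    order = np.argsort(atoms)
    atoms, weights = atoms[order], weights[order]
    cdf = np.cumsum(weights) / weights.sum()
    # Generalized inverse: first atom whose cumulative mass reaches the level.
    idx = np.searchsorted(cdf, levels, side="left")
    return atoms[np.minimum(idx, len(atoms) - 1)]

def global_disparity(summaries, m=200):
    """Weighted sum of squared W2 distances from each group law to the
    barycenter (assumption: weights are pooled group proportions)."""
    levels = (np.arange(m) + 0.5) / m
    groups = sorted({g for s in summaries for g in s})
    counts = np.array(
        [sum(s[g][0] for s in summaries if g in s) for g in groups], dtype=float
    )
    w = counts / counts.sum()
    Q = np.stack([pooled_quantiles(summaries, g, levels) for g in groups])
    bary = w @ Q                               # barycenter quantile function
    w2sq = ((Q - bary) ** 2).mean(axis=1)      # grid approximation of W2^2
    return float(w @ w2sq)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    summaries = []
    for shift in (0.0, 0.5):                   # two synthetic silos
        groups = rng.integers(0, 2, size=1000)
        scores = rng.normal(loc=shift + 0.3 * groups, size=1000)
        summaries.append(silo_summary(scores, groups, k=32))
    print(global_disparity(summaries))
```

Each silo sends one integer and `k` floats per sensitive group, so communication does not grow with the local sample size. Pooling the atoms before inverting the CDF matters because the quantile function of a mixture is not the average of the silo quantile functions.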