Federated Measurement of Demographic Disparities from Quantile Sketches

📅 2026-02-21
🀖 AI Summary
This work addresses the problem of auditing group fairness across data silos under privacy constraints, a setting in which local and global fairness metrics often diverge. The authors propose an auditing method for horizontal federated learning that estimates the global Wasserstein–FrĂ©chet variance without sharing raw data: each participant sends only group counts and a quantile summary of its local score distributions. An ANOVA-style decomposition of the squared Wasserstein distance separates the contribution of selection-induced mixture effects from that of cross-silo heterogeneity. The resulting one-shot federated estimator has $O(1/k)$ discretization bias for $k$ quantiles and comes with finite-sample guarantees. Experiments on synthetic data and COMPAS show that a few dozen quantiles suffice to recover the global disparity and diagnose its sources, at low communication cost.

📝 Abstract
Many fairness goals are defined at a population level that misaligns with siloed data collection, which remains unsharable due to privacy regulations. Horizontal federated learning (FL) enables collaborative modeling across clients with aligned features without sharing raw data. We study federated auditing of demographic parity through score distributions, measuring disparity as a Wasserstein–FrĂ©chet variance between sensitive-group score laws, and expressing the population metric in federated form that makes explicit how silo-specific selection drives local-global mismatch. For the squared Wasserstein distance, we prove an ANOVA-style decomposition that separates (i) selection-induced mixture effects from (ii) cross-silo heterogeneity, yielding tight bounds linking local and global metrics. We then propose a one-shot, communication-efficient protocol in which each silo shares only group counts and a quantile summary of its local score distributions, enabling the server to estimate global disparity and its decomposition, with $O(1/k)$ discretization bias ($k$ quantiles) and finite-sample guarantees. Experiments on synthetic data and COMPAS show that a few dozen quantiles suffice to recover global disparity and diagnose its sources.
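The one-shot idea in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol, and every function name below is hypothetical. It assumes that each silo reports, per sensitive group, a count and $k$ empirical quantiles at midpoint levels $(i + 0.5)/k$; that the server treats each sketch as $k$ equally weighted atoms when pooling silos; and that the Wasserstein–FrĂ©chet variance is the count-weighted mean squared $W_2$ distance from each group's score law to their barycenter. In one dimension, $W_2^2(\mu, \nu) = \int_0^1 (F_\mu^{-1}(u) - F_\nu^{-1}(u))^2 \, du$ and the barycenter's quantile function is the weighted average of the group quantile functions, so the whole computation reduces to arithmetic on quantile vectors.

```python
from bisect import bisect_left
from itertools import accumulate
from math import ceil


def midpoint_levels(k):
    """Quantile levels (i + 0.5) / k for i = 0..k-1 (an assumed grid)."""
    return [(i + 0.5) / k for i in range(k)]


def quantile_sketch(scores, k):
    """Silo side: k empirical quantiles of one group's local scores."""
    xs = sorted(scores)
    n = len(xs)
    # Inverse empirical CDF: smallest x whose CDF value is at least u.
    return [xs[max(0, ceil(u * n) - 1)] for u in midpoint_levels(k)]


def pooled_quantiles(sketches, counts, k):
    """Server side: quantiles of the count-weighted mixture of silo sketches.

    A silo with n records contributes mass n / len(sketch) per atom.
    """
    atoms = sorted(
        (x, n / len(sketch))
        for sketch, n in zip(sketches, counts)
        for x in sketch
    )
    total = sum(w for _, w in atoms)
    cdf = [c / total for c in accumulate(w for _, w in atoms)]
    pooled = []
    for u in midpoint_levels(k):
        j = min(bisect_left(cdf, u), len(atoms) - 1)
        pooled.append(atoms[j][0])
    return pooled


def frechet_variance(group_quantiles, group_counts):
    """Count-weighted mean squared W2 distance to the 1-D barycenter."""
    total = sum(group_counts)
    p = [n / total for n in group_counts]
    k = len(group_quantiles[0])
    # In 1-D the barycenter quantile function is the weighted average.
    barycenter = [
        sum(pa * q[i] for pa, q in zip(p, group_quantiles)) for i in range(k)
    ]
    return sum(
        pa * sum((q[i] - barycenter[i]) ** 2 for i in range(k)) / k
        for pa, q in zip(p, group_quantiles)
    )
```

A server would call `pooled_quantiles` once per sensitive group, passing that group's sketches and counts from every silo, and then pass the pooled vectors and the global group counts to `frechet_variance`. Pooling has to happen on the mixture CDF instead of averaging silo quantiles directly, because the quantile function of a mixture is not the average of the component quantile functions; that gap is one place where local and global metrics can disagree.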
Problem

Research questions and friction points this paper is trying to address.

federated learning
demographic disparity
fairness auditing
data silos
privacy-preserving
Innovation

Methods, ideas, or system contributions that make the work stand out.

federated auditing
demographic parity
Wasserstein distance
quantile sketches
ANOVA decomposition
Authors
Arthur Charpentier
Université du Québec à Montréal
Risk, insurance, predictive modeling, computational statistics, actuarial science
Agathe Fernandes Machado
Université du Québec à Montréal
Olivier CÎté
Université Laval, Québec, Canada
François Hu
Milliman, Paris, France; Université Claude Bernard, Lyon, France