🤖 AI Summary
In differentially private (DP) machine learning, enforcing privacy guarantees can introduce significant performance disparities across demographic groups, yet the mechanisms behind this unfairness remain poorly understood. Method: We conduct a systematic causal analysis of the ML pipeline, integrating DP mechanisms, fairness metrics, and empirical settings to decompose the sources of unfairness, and we propose an end-to-end causal taxonomy for unfairness in DP learning that unifies fragmented definitions of fairness and privacy. Contribution/Results: Our analysis identifies properties of the training dataset and its underlying distribution, such as distribution shift and training set bias, as the primary drivers of disparate impact, outweighing algorithmic and optimization factors. Building on this, we formulate distribution-aware principles for privacy-preserving algorithm design and provide actionable mitigation strategies. This work bridges causal reasoning with DP fairness, offering both theoretical insight and practical guidance for equitable private learning.
📝 Abstract
Differential privacy has emerged as the most studied framework for privacy-preserving machine learning. However, recent studies show that enforcing differential privacy guarantees can not only significantly degrade the utility of a model, but also amplify existing disparities in its predictive performance across demographic groups. Although extensive research has sought to identify the factors that contribute to this phenomenon, we still lack a complete understanding of the mechanisms through which differential privacy exacerbates disparities. The literature on this problem is muddled by varying definitions of fairness, differing differential privacy mechanisms, and inconsistent experimental settings, often leading to seemingly contradictory results. This survey provides the first comprehensive overview of the factors that contribute to the disparate effect of training models with differential privacy guarantees. We discuss their impact and analyze their causal role in this disparate effect. Our analysis is guided by a taxonomy that categorizes these factors by their position within the machine learning pipeline, allowing us to draw conclusions about their interactions and the feasibility of potential mitigation strategies. We find that factors related to the training dataset and its underlying distribution play a decisive role in the occurrence of disparate impact, highlighting the need for further research on these factors to address the issue.
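The phenomenon the abstract describes can be made concrete with a small simulation. The sketch below is illustrative only and not taken from the survey: it trains logistic regression with the two core ingredients of DP-SGD (per-example gradient clipping and Gaussian noise) on synthetic data in which a minority group follows a shifted label rule, then reports per-group accuracy. All names (`dp_sgd_logreg`, `group_accuracy`) and hyperparameters are assumptions chosen for the demonstration.

```python
# Illustrative sketch, not the survey's method: DP-SGD-style training
# (per-example clipping + Gaussian noise) on synthetic imbalanced groups.
import numpy as np

def dp_sgd_logreg(X, y, clip=1.0, noise_mult=1.0, lr=0.1, epochs=50, seed=0):
    """Logistic regression via full-batch gradient descent with per-example
    gradient clipping and Gaussian noise (the core of DP-SGD)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))            # predicted probabilities
        grads = (p - y)[:, None] * X                 # per-example gradients
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noise = rng.normal(0.0, noise_mult * clip, size=d)  # calibrated noise
        w -= lr * (grads.sum(axis=0) + noise) / n
    return w

def group_accuracy(w, X, y, group):
    """Accuracy per demographic group (0 = majority, 1 = minority)."""
    pred = (X @ w > 0).astype(int)
    return {g: float((pred[group == g] == y[group == g]).mean()) for g in (0, 1)}

# Synthetic data: 900 majority vs 100 minority examples; the minority group's
# labels follow a different (shifted) decision rule, i.e. distribution shift.
rng = np.random.default_rng(42)
n_major, n_minor = 900, 100
X = rng.normal(size=(n_major + n_minor, 5))
group = np.array([0] * n_major + [1] * n_minor)
w_major = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
w_minor = np.array([0.0, 0.0, 1.0, 1.0, 0.0])
y = (np.where(group == 0, X @ w_major, X @ w_minor) > 0).astype(int)

w_dp = dp_sgd_logreg(X, y, noise_mult=2.0)   # private training
w_np = dp_sgd_logreg(X, y, noise_mult=0.0)   # non-private baseline
print("private:    ", group_accuracy(w_dp, X, y, group))
print("non-private:", group_accuracy(w_np, X, y, group))
```

Because the majority group dominates the (clipped, noisy) gradient signal, the learned model tracks the majority's decision rule; the minority group, whose labels depend on different features, tends to bear the brunt of both the imbalance and the added noise. This is a toy instance of the data-layer factors (group imbalance and distribution shift) that the survey identifies as decisive.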