🤖 AI Summary
In federated learning, Shapley-value-based contribution evaluation suffers from significant instability due to aggregation strategies and data heterogeneity, leading to inaccurate reward allocation and weakened participant incentives. This work is the first to systematically characterize the pronounced volatility of Shapley values under mainstream aggregation algorithms—including FedAvg and FedProx—and across both IID and non-IID data distributions. We propose a gradient-driven model reconstruction method to efficiently approximate Shapley values. Empirical results demonstrate that, under non-IID settings, the variance of Shapley estimates increases by up to 2.3×, and discrepancies across aggregation strategies exceed 40%. These findings challenge the implicit assumption of Shapley-value stability underlying existing fairness-aware incentive mechanisms. Our study provides critical empirical evidence and concrete directions for designing robust, aggregation-aware contribution evaluation frameworks in federated learning.
📝 Abstract
Federated learning (FL) is a collaborative and privacy-preserving Machine Learning paradigm, allowing the development of robust models without the need to centralize sensitive data. A critical challenge in FL lies in fairly and accurately allocating contributions from diverse participants. Inaccurate allocation can undermine trust, lead to unfair compensation, and thus participants may lack the incentive to join or actively contribute to the federation. Various remuneration strategies have been proposed to date, including auction-based approaches and Shapley-value-based methods, the latter offering a means to quantify the contribution of each participant. However, little to no work has studied the stability of these contribution evaluation methods. In this paper, we evaluate participant contributions in federated learning using gradient-based model reconstruction techniques with Shapley values and compare the round-based contributions to a classic data contribution measurement scheme. We provide an extensive analysis of the discrepancies of Shapley values across a set of aggregation strategies, and examine them on an overall and a per-client level. We show that, between different aggregation techniques, Shapley values lead to unstable reward allocations among participants. Our analysis spans various data heterogeneity distributions, including independent and identically distributed (IID) and non-IID scenarios.