PDSL: Privacy-Preserved Decentralized Stochastic Learning with Heterogeneous Data Distribution

πŸ“… 2025-03-31
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To address the dual challenges of poor model robustness under data heterogeneity and privacy leakage from gradient exchange in decentralized learning, this paper proposes PDSL, the first collaborative learning framework to integrate Shapley values with differential privacy. Methodologically, it uses Shapley values, introduced into decentralized settings for the first time, to quantify each heterogeneous neighbor's contribution to the global model, enabling contribution-aware weighted aggregation; during gradient exchange, it injects calibrated noise satisfying Ξ΅-differential privacy to rigorously protect individual data privacy. Theoretically, the paper proves convergence under non-convex assumptions and gives formal Ξ΅-differential privacy guarantees. Empirically, the framework improves both convergence speed and accuracy on heterogeneous data while achieving a favorable trade-off between privacy budget and model utility.
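The gradient-perturbation side of the summary can be illustrated with the standard Laplace mechanism: clip each shared gradient to bound its sensitivity, then add noise with scale sensitivity/Ξ΅. This is a minimal sketch under common DP conventions, not PDSL's actual mechanism; the function name, clipping rule, and parameters here are assumptions.

```python
import numpy as np

def privatize_gradient(grad, clip_norm, epsilon, rng=None):
    """Clip a gradient and add Laplace noise calibrated to eps-DP.

    Illustrative sketch only: PDSL's exact mechanism and sensitivity
    analysis are defined in the paper; this is the textbook Laplace
    mechanism applied to a clipped gradient vector.
    """
    rng = rng or np.random.default_rng()
    # Clip to bound the L1 sensitivity of the shared gradient
    norm = np.linalg.norm(grad, ord=1)
    if norm > clip_norm:
        grad = grad * (clip_norm / norm)
    # Laplace noise with scale = sensitivity / epsilon yields eps-DP
    scale = clip_norm / epsilon
    return grad + rng.laplace(loc=0.0, scale=scale, size=grad.shape)
```

A larger Ξ΅ (weaker privacy) shrinks the noise scale, which is the privacy/utility trade-off the summary refers to.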

πŸ“ Abstract
In the paradigm of decentralized learning, a group of agents collaborates to learn a global model using distributed datasets without a central server. However, due to the heterogeneity of the local data across the different agents, learning a robust global model is rather challenging. Moreover, the collaboration of the agents relies on their gradient information exchange, which poses a risk of privacy leakage. In this paper, to address these issues, we propose PDSL, a novel privacy-preserved decentralized stochastic learning algorithm with heterogeneous data distribution. On one hand, we innovate in utilizing the notion of Shapley values such that each agent can precisely measure the contributions of its heterogeneous neighbors to the global learning goal; on the other hand, we leverage the notion of differential privacy to prevent each agent from suffering privacy leakage when it contributes gradient information to its neighbors. We conduct both solid theoretical analysis and extensive experiments to demonstrate the efficacy of our PDSL algorithm in terms of privacy preservation and convergence.
Problem

Research questions and friction points this paper is trying to address.

Data heterogeneity across agents makes learning a robust global model in decentralized settings difficult
Exchanging gradient information among collaborating agents risks privacy leakage
How can agents measure their neighbors' contributions while preserving privacy?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Shapley values for contribution measurement
Applies differential privacy for gradient protection
Decentralized stochastic learning with heterogeneous data
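The contribution-measurement idea in the bullets above can be sketched with the classical Shapley value: each neighbor's weight is its average marginal contribution to a coalition utility (e.g., local validation accuracy after aggregating that coalition's gradients). This exact-enumeration sketch is illustrative only; PDSL's estimator and utility definition may differ, and `shapley_weights`/`utility` are hypothetical names.

```python
import math
from itertools import combinations

def shapley_weights(neighbors, utility):
    """Exact Shapley values of neighbors, normalized into aggregation weights.

    `utility` maps a frozenset of neighbor ids to a real-valued score.
    Exact enumeration is exponential in len(neighbors); practical systems
    use Monte Carlo estimates instead.
    """
    n = len(neighbors)
    phi = {a: 0.0 for a in neighbors}
    for a in neighbors:
        others = [b for b in neighbors if b != a]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Shapley weight of a size-k coalition not containing `a`
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[a] += w * (utility(frozenset(S) | {a}) - utility(frozenset(S)))
    total = sum(phi.values())
    # Normalize to aggregation weights (assumes non-negative contributions)
    if total > 0:
        return {a: v / total for a, v in phi.items()}
    return {a: 1.0 / n for a in phi}
```

With an additive utility such as `len(S)`, every neighbor contributes equally and the weights collapse to the uniform 1/n average, matching vanilla decentralized aggregation.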
Lina Wang
Professor, Wuhan University
Computer Security
Yunsheng Yuan
School of Computer Science and Technology, Shandong University, Qingdao, China
Chunxiao Wang
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; Shandong Provincial Key Laboratory of Computing Power Internet and Service Computing, Shandong Fundamental Research Center for Computer Science, Jinan, China
Feng Li
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China