Secure Shapley Value for Cross-Silo Federated Learning

📅 2022-09-11
🏛️ Proceedings of the VLDB Endowment
📈 Citations: 20
Influential: 1
🤖 AI Summary
To address privacy risks in cross-institutional federated learning (FL) arising from Shapley value computation—where raw local models and private test data must be exposed—this paper proposes SecSV, the first secure two-server protocol that requires neither access to participants’ original models nor their private test samples. SecSV integrates homomorphic encryption with two-server secure multi-party computation (MPC), designs an efficient and secure matrix multiplication primitive, and introduces a sample-adaptive skipping strategy to reduce communication and computational overhead. Evaluated on mainstream FL benchmarks, SecSV achieves 7.2–36.6× speedup over the baseline HESV while maintaining bounded Shapley value estimation error. It simultaneously ensures high efficiency and strong privacy guarantees, enabling the first trustworthy contribution evaluation under strict model-and-data isolation.
📝 Abstract
The Shapley value (SV) is a fair and principled metric for contribution evaluation in cross-silo federated learning (cross-silo FL), wherein organizations, i.e., clients, collaboratively train prediction models with the coordination of a parameter server. However, existing SV calculation methods for FL assume that the server can access the raw FL models and public test data. This may not be a valid assumption in practice considering the emerging privacy attacks on FL models and the fact that test data might be clients' private assets. Hence, we investigate the problem of secure SV calculation for cross-silo FL. We first propose HESV, a one-server solution based solely on homomorphic encryption (HE) for privacy protection, which has limitations in efficiency. To overcome these limitations, we propose SecSV, an efficient two-server protocol with the following novel features. First, SecSV utilizes a hybrid privacy protection scheme to avoid ciphertext-ciphertext multiplications between test data and models, which are extremely expensive under HE. Second, an efficient secure matrix multiplication method is proposed for SecSV. Third, SecSV strategically identifies and skips some test samples without significantly affecting the evaluation accuracy. Our experiments demonstrate that SecSV is 7.2–36.6× as fast as HESV, with a limited loss in the accuracy of calculated SVs.
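To ground the abstract's core object: the Shapley value of a client is its marginal utility contribution averaged over all coalitions of the other clients. Below is a minimal plaintext sketch of exact SV computation (no privacy protection, so not the paper's protocol); the client names and the accuracy-table utility are hypothetical, and the exponential coalition enumeration is exactly the cost that makes efficient secure protocols like SecSV necessary.

```python
from itertools import combinations
from math import factorial

def shapley_values(clients, utility):
    """Exact Shapley values: each client's marginal contribution to
    utility, weighted over all coalitions of the remaining clients.
    Runs in O(2^n) utility evaluations."""
    n = len(clients)
    sv = {c: 0.0 for c in clients}
    for c in clients:
        others = [x for x in clients if x != c]
        for k in range(n):
            for coalition in combinations(others, k):
                s = set(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                sv[c] += weight * (utility(s | {c}) - utility(s))
    return sv

# Hypothetical utility: test accuracy of the model trained by each coalition.
acc = {frozenset(): 0.0, frozenset({"A"}): 0.6, frozenset({"B"}): 0.5,
       frozenset({"A", "B"}): 0.8}
vals = shapley_values(["A", "B"], lambda s: acc[frozenset(s)])
```

Note the efficiency property: the values sum to the grand coalition's utility, which is why SV is considered a fair split of the jointly earned performance.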
Problem

Research questions and friction points this paper is trying to address.

Secure Shapley Value calculation in cross-silo federated learning
Protecting privacy of FL models and clients' test data
Overcoming efficiency limitations in homomorphic encryption solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Secure two-server protocol with hybrid privacy
Efficient secure matrix multiplication method
Strategic test-sample skipping to cut overhead with limited accuracy loss
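The payoff of the hybrid protection scheme is that operations mixing a protected model with plaintext data stay cheap. A secret-sharing analogue (a toy sketch, not SecSV's actual primitive; `MOD`, `share`, and `reconstruct` are illustrative names) shows why: multiplying an additively shared weight matrix by a plaintext matrix is a purely local operation on each server's share, with no interaction, whereas multiplying two protected operands would require an expensive secure-multiplication step.

```python
import numpy as np

MOD = 2**31  # toy arithmetic modulus; real protocols pick scheme-specific parameters
rng = np.random.default_rng(42)

def share(x):
    """Split matrix x into two additive shares: s0 + s1 = x (mod MOD)."""
    s0 = rng.integers(0, MOD, size=x.shape, dtype=np.int64)
    s1 = (x - s0) % MOD
    return s0, s1

def reconstruct(s0, s1):
    return (s0 + s1) % MOD

# Model weights W are secret-shared across two servers; a plaintext
# matrix X can be multiplied against each share locally, analogous to
# cheap plaintext-ciphertext operations under HE.
W = rng.integers(0, 100, size=(3, 2), dtype=np.int64)
X = rng.integers(0, 100, size=(4, 3), dtype=np.int64)

w0, w1 = share(W)
y0 = (X @ w0) % MOD  # server 0, local only
y1 = (X @ w1) % MOD  # server 1, local only

assert np.array_equal(reconstruct(y0, y1), (X @ W) % MOD)
```

The correctness follows from linearity: X @ w0 + X @ w1 = X @ (w0 + w1), which equals X @ W modulo MOD.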