🤖 AI Summary
Federated learning (FL) faces severe data-poisoning attacks under open participation, against which existing defenses are largely reactive, computationally expensive, and reliant on a majority-honest assumption. This paper proposes a lightweight Bayesian incentive mechanism, presented as the first to incorporate Bayesian game-theoretic modeling into FL incentive design, which enables proactive defense by rendering data poisoning economically unprofitable. The mechanism combines private validation-set quality assessment with dynamic reward-penalty allocation to align client incentives with honest behavior. It satisfies individual rationality and incentive compatibility, operates within a controllable budget, and integrates seamlessly into standard FL frameworks. Under a 50% malicious-client attack, the method achieves 96.7% test accuracy on MNIST, outperforming FedAvg by 51.7 percentage points and demonstrating substantial gains in global-model robustness and training efficiency.
📝 Abstract
Federated learning (FL) enables collaborative model training across decentralized clients while preserving data privacy. However, its open-participation nature exposes it to data-poisoning attacks, in which malicious actors submit corrupted model updates to degrade the global model. Existing defenses are often reactive, relying on statistical aggregation rules that can be computationally expensive and that typically assume an honest majority. This paper introduces a proactive, economic defense: a lightweight Bayesian incentive mechanism that makes malicious behavior economically irrational. Each training round is modeled as a Bayesian game of incomplete information in which the server, acting as the principal, uses a small, private validation dataset to verify update quality before issuing payments. The design satisfies Individual Rationality (IR) for benevolent clients, ensuring their participation is profitable, and Incentive Compatibility (IC), making poisoning an economically dominated strategy. Extensive experiments on non-IID partitions of MNIST and FashionMNIST demonstrate robustness: with 50% label-flipping adversaries on MNIST, the mechanism maintains 96.7% accuracy, only 0.3 percentage points lower than in a scenario with 30% label-flipping adversaries. This outcome is 51.7 percentage points better than standard FedAvg, which collapses under the same 50% attack. The mechanism is computationally light, budget-bounded, and readily integrates into existing FL frameworks, offering a practical route to economically robust and sustainable FL ecosystems.
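The economic core described above, validation-gated payments under a per-round budget cap, can be sketched as follows. This is a minimal illustration only: the function names, the accept/reject rule, the penalty magnitude, and the budget-scaling step are assumptions for exposition, not the paper's exact mechanism.

```python
# Hypothetical sketch: the server scores each client's update on its small
# private validation set, rewards updates that do not degrade the current
# global model, penalizes those that do, and caps total rewards per round.

def payout(validation_scores, baseline, reward=1.0, penalty=2.0, budget=2.0):
    """Return a per-client payment dict for one training round.

    validation_scores: client -> accuracy of the global model after
                       (tentatively) applying that client's update,
                       measured on the server's private validation set.
    baseline:          accuracy of the current global model on that set.
    """
    payments = {}
    for client, score in validation_scores.items():
        if score >= baseline:
            payments[client] = reward    # honest-looking update: positive payoff
        else:
            payments[client] = -penalty  # degrading update: expected net loss (IC)
    # Budget-bounded: scale positive payments so the round stays within budget.
    total_rewards = sum(p for p in payments.values() if p > 0)
    if total_rewards > budget:
        scale = budget / total_rewards
        payments = {c: p * scale if p > 0 else p for c, p in payments.items()}
    return payments

# Example round: baseline accuracy 0.90; c3 submits a poisoned update.
scores = {"c1": 0.93, "c2": 0.91, "c3": 0.40, "c4": 0.92}
pay = payout(scores, baseline=0.90)
```

Because a label-flipping update fails the validation check with high probability, its expected payoff is negative, which is what makes poisoning an economically dominated strategy; honest clients keep a positive expected payoff (IR) even after budget scaling.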