AI Summary
The practical feasibility of gradient leakage attacks (GLAs) in federated learning (FL) remains contentious: prior work demonstrates success only under unrealistic assumptions, e.g., small batch sizes and known data distributions. This paper identifies gradient matching difficulty as the primary cause of GLA failure in realistic FL settings. To address this, we propose FedLeak, the first framework to couple partial gradient matching with L2/L1 gradient regularization, eliminating reliance on strong assumptions such as small batches or distributional priors. Under standard FL conditions (heterogeneous data, large batch sizes, and no knowledge of the underlying data distributions), FedLeak achieves robust, high-fidelity reconstruction of the original training samples, significantly improving both image and text recovery accuracy and attack success rates. Our results provide empirically grounded, realistic evidence for privacy risk assessment in practical FL deployments.
Abstract
Federated learning (FL) enables collaborative model training among multiple clients without exposing raw data. Its ability to safeguard privacy, the core promise of FL, has recently become the subject of heated debate. Several studies have introduced a class of attacks known as gradient leakage attacks (GLAs), which exploit the gradients shared during training to reconstruct clients' raw data. Other work, however, contends that GLAs pose no substantial privacy risk in practical FL environments, since their effectiveness is limited to overly relaxed conditions such as small batch sizes and knowledge of clients' data distributions. This paper bridges this critical gap by empirically demonstrating that clients' data can still be effectively reconstructed, even in realistic FL environments. Revisiting GLAs, we find that their performance failures stem from an inability to solve the underlying gradient matching problem. To alleviate this bottleneck, we develop FedLeak, which introduces two novel techniques: partial gradient matching and gradient regularization. Moreover, to evaluate FedLeak in real-world FL environments, we formulate a practical evaluation protocol grounded in a thorough review of the FL literature and industry practices. Under this protocol, FedLeak still achieves high-fidelity data reconstruction, underscoring a significant vulnerability in FL systems and the urgent need for more effective defense methods.
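The gradient matching problem at the heart of GLAs can be illustrated with a minimal, hypothetical sketch (this is a generic DLG-style attack on a toy linear model, not FedLeak itself; all model shapes, values, and the learning rate below are illustrative assumptions). The attacker observes the gradient a client computed on a secret sample and iteratively optimizes dummy data until the dummy gradient matches the observed one:

```python
import numpy as np

# Toy victim model: linear regression with bias, loss 0.5 * (w.x + b - y)^2.
# The attacker knows the shared parameters (w, b) and observes only the
# gradients computed on a secret sample (x_secret, y_secret).
w = np.array([0.5, -0.3])            # shared model weights (illustrative values)
b = 0.2                              # shared model bias
x_secret = np.array([1.0, -0.5])     # client's secret input
y_secret = -0.5                      # client's secret label

r_true = w @ x_secret + b - y_secret  # residual on the secret sample
g_w = r_true * x_secret               # observed gradient w.r.t. w
g_b = r_true                          # observed gradient w.r.t. b

# Gradient matching: optimize dummy data (x, y) so the gradient it induces
# matches the observed gradient, by gradient descent on the matching loss.
x, y, lr = np.zeros(2), 0.0, 0.05
for _ in range(50000):
    r = w @ x + b - y
    Dw = r * x - g_w                  # mismatch on the weight gradient
    Db = r - g_b                      # mismatch on the bias gradient
    # analytic gradients of the matching loss ||Dw||^2 + Db^2 w.r.t. (x, y)
    grad_x = 2.0 * (r * Dw + (x @ Dw) * w) + 2.0 * Db * w
    grad_y = -2.0 * (x @ Dw) - 2.0 * Db
    x -= lr * grad_x
    y -= lr * grad_y

r = w @ x + b - y
match_loss = np.sum((r * x - g_w) ** 2) + (r - g_b) ** 2
# When the matching loss reaches zero, (x, y) closely recovers the secret
# sample, since the bias gradient pins down the residual and hence x = g_w / r.
```

In realistic settings (deep networks, large batches, heterogeneous data) this optimization becomes far harder to solve, which is the failure mode the paper attributes GLA breakdowns to and which FedLeak's partial matching and regularization are designed to mitigate.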