AI Summary
While federated learning (FL) avoids uploading raw data, model updates, particularly gradients, remain vulnerable to privacy attacks such as gradient/model inversion and membership inference, risking leakage of sensitive training data. Method: This paper systematically surveys privacy attacks in FL, analyzing their feasibility under realistic constraints (e.g., non-IID data, few clients); evaluates the limitations of mainstream defenses, including differential privacy and secure aggregation; and integrates industrial deployment cases with global regulatory frameworks (GDPR, CCPA). Contribution/Results: We propose the first FL privacy risk taxonomy unifying adversarial theory, empirical failure analysis, and compliance requirements. We also introduce deployment-oriented dimensions for evaluating defense effectiveness and deliver a trustworthy FL implementation roadmap, balancing model utility and privacy guarantees, alongside a policy alignment guide for regulatory compliance.
Abstract
Deep learning has shown incredible potential across a wide array of tasks, and accompanying this growth has been an insatiable appetite for data. However, much of the data needed for deep learning is stored on personal devices, and recent privacy concerns have further highlighted the challenges of accessing such data. As a result, federated learning (FL) has emerged as an important privacy-preserving technology that enables collaborative training of machine learning models without the need to send the raw, potentially sensitive, data to a central server. However, the fundamental premise that sending model updates to a server is privacy-preserving only holds if the updates cannot be "reverse engineered" to infer information about the private training data. It has been shown under a wide variety of settings that this privacy premise does not hold. In this survey paper, we provide a comprehensive literature review of the different privacy attacks and defense methods in FL. We identify the current limitations of these attacks and highlight the settings in which the privacy of an FL client can be broken. We further dissect some of the successful industry applications of FL and draw lessons for future successful adoption. We survey the emerging landscape of privacy regulation for FL and conclude with future directions for taking FL toward the cherished goal of generating accurate models while preserving the privacy of the data from its participants.
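To make the "reverse engineering" premise concrete, here is a minimal illustrative sketch (not from the paper, all names hypothetical) of why a raw gradient can leak a client's input: for a single-sample update to a linear layer with a bias term, the private input is recoverable in closed form from the gradient alone.

```python
import numpy as np

# Client's private sample and label (unknown to the server).
x_true = np.array([0.5, -1.2, 3.0, 0.7])
y_true = 2.0

# Shared model state: one linear neuron with bias, loss = (w.x + b - y)^2.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
b = 0.1

# The client computes its update and sends these gradients to the server.
residual = w @ x_true + b - y_true
grad_w = 2 * residual * x_true   # dL/dw = 2(w.x + b - y) * x
grad_b = 2 * residual            # dL/db = 2(w.x + b - y)

# Server-side "inversion": grad_w is grad_b scaled by x, so (when grad_b != 0)
# dividing the two recovers the private input exactly.
x_recovered = grad_w / grad_b
print(np.allclose(x_recovered, x_true))  # True
```

Real attacks such as gradient inversion generalize this idea to deep networks and batched updates via iterative optimization rather than a closed-form division, but the leakage mechanism is the same: the update is a function of the private data.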