Cutting Through Privacy: A Hyperplane-Based Data Reconstruction Attack in Federated Learning

📅 2025-05-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
In federated learning (FL), a malicious server can reconstruct clients' private data from model updates. Existing reconstruction attacks, however, rely on strong assumptions about the data distribution and fail once batch sizes exceed a few tens of samples. This work proposes a reconstruction attack that requires no prior knowledge and makes no assumptions about the data distribution. It introduces a geometric perspective on fully connected layer weights, interpreting them as hyperplanes, and jointly designs malicious parameters, gradient inversion mappings, and classification-feature decoupling to enable high-fidelity reconstruction of arbitrarily large batches (up to thousands of samples). Evaluated on image and tabular datasets, the method achieves significantly higher reconstruction fidelity than state-of-the-art approaches and handles batches more than two orders of magnitude larger. These results expose severe privacy vulnerabilities in FL under realistic, non-ideal server assumptions.

📝 Abstract
Federated Learning (FL) enables collaborative training of machine learning models across distributed clients without sharing raw data, ostensibly preserving data privacy. Nevertheless, recent studies have revealed critical vulnerabilities in FL, showing that a malicious central server can manipulate model updates to reconstruct clients' private training data. Existing data reconstruction attacks have important limitations: they often rely on assumptions about the clients' data distribution or their efficiency significantly degrades when batch sizes exceed just a few tens of samples. In this work, we introduce a novel data reconstruction attack that overcomes these limitations. Our method leverages a new geometric perspective on fully connected layers to craft malicious model parameters, enabling the perfect recovery of arbitrarily large data batches in classification tasks without any prior knowledge of clients' data. Through extensive experiments on both image and tabular datasets, we demonstrate that our attack outperforms existing methods and achieves perfect reconstruction of data batches two orders of magnitude larger than the state of the art.

Problem

Research questions and friction points this paper is trying to address.

Overcoming the reliance of existing FL data reconstruction attacks on data-distribution assumptions and small batch sizes
Enabling perfect recovery of large data batches without prior knowledge
Significantly improving attack efficiency on both image and tabular datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages a geometric perspective on fully connected layers to craft malicious model parameters
Enables perfect recovery of large data batches
Requires no prior knowledge of clients' data
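
The hyperplane view of a fully connected layer can be illustrated with a well-known identity: for a ReLU neuron with weights w and bias b, the batch gradient with respect to w is a weighted sum of the inputs that lie on the positive side of the hyperplane w·x + b = 0, and the gradient with respect to b is the sum of those same weights. If exactly one sample activates the neuron, their ratio is that sample. The sketch below is a minimal NumPy illustration of this identity only; it is not the paper's attack, whose parameter-crafting procedure is not described in this summary. The function names, the toy data, and the `upstream` gradient values are all hypothetical.

```python
import numpy as np

def fc_relu_grads(W, b, X, upstream):
    """Batch-summed gradients w.r.t. W and b for a = relu(X @ W.T + b).

    upstream[n, i] stands for dLoss/da[n, i]; here it is an arbitrary
    placeholder (assumption), since the rest of the network is not modeled.
    """
    Z = X @ W.T + b               # pre-activations, shape (N, H)
    delta = upstream * (Z > 0)    # ReLU passes gradient only where Z > 0
    grad_W = delta.T @ X          # (H, D): sum over n of delta[n, i] * X[n]
    grad_b = delta.sum(axis=0)    # (H,):   sum over n of delta[n, i]
    return grad_W, grad_b

def recover_isolated_input(grad_W, grad_b, neuron):
    """Return grad_W[neuron] / grad_b[neuron].

    This equals an input sample only if exactly one sample in the batch
    lies on the positive side of that neuron's hyperplane.
    """
    if np.isclose(grad_b[neuron], 0.0):
        raise ValueError("neuron received no gradient; nothing to recover")
    return grad_W[neuron] / grad_b[neuron]

# Toy batch of three 2-D samples (hypothetical data).
X = np.array([[1.0, 2.0],
              [3.0, 1.0],
              [5.0, 4.0]])

# Neuron 0: hyperplane x0 = 4, so only the last sample is on the positive side.
# Neuron 1: hyperplane x0 = 0, so all three samples are on the positive side.
W = np.array([[1.0, 0.0],
              [1.0, 0.0]])
b = np.array([-4.0, 0.0])

# Arbitrary nonzero upstream gradients (assumption).
upstream = np.array([[0.5, -1.0],
                     [2.0,  0.3],
                     [-0.7, 1.5]])

grad_W, grad_b = fc_relu_grads(W, b, X, upstream)
x_isolated = recover_isolated_input(grad_W, grad_b, neuron=0)
x_mixed = recover_isolated_input(grad_W, grad_b, neuron=1)

print("neuron 0 ratio:", x_isolated)  # matches X[2] up to rounding
print("neuron 1 ratio:", x_mixed)     # a mixture, not any single sample
```

Neuron 0 shows why hyperplane placement matters: because its hyperplane separates one sample from the rest of the batch, the unknown upstream factor cancels in the ratio. Neuron 1, whose hyperplane leaves all samples on the same side, yields only a weighted mixture of the inputs.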