🤖 AI Summary
Existing data reconstruction attacks (DRAs) in federated learning lack a theoretically grounded framework for quantifying privacy risk. Method: the paper introduces the first theory-driven risk quantification framework based on differentiable inverse-mapping modeling, proposing Invertibility Loss (InvLoss), a per-sample measure of the maximum achievable effectiveness of a DRA, and deriving a tight, computable upper bound for it. Jacobian spectral analysis then reveals a unified mechanism in which the singular-value distribution of the exchanged updates governs DRA risk. Building on this, the paper designs InvRE, an attack-agnostic risk estimator, along with an adaptive noise defense strategy. Results: evaluated on multiple real-world datasets, InvRE achieves under 5% risk estimation error, and the proposed defense reduces DRA success rates by over 80% without degrading model accuracy, achieving a Pareto-optimal trade-off between privacy protection and model utility.
📝 Abstract
Data Reconstruction Attacks (DRAs) pose a significant threat to Federated Learning (FL) systems by enabling adversaries to infer sensitive training data held by local clients. Despite extensive research, how to characterize and assess the risk of DRAs in FL systems remains unresolved due to the lack of a theoretically grounded risk quantification framework. In this work, we address this gap by introducing Invertibility Loss (InvLoss) to quantify the maximum achievable effectiveness of DRAs for a given data instance and FL model. We derive a tight, computable upper bound for InvLoss and explore its implications from three perspectives. First, we show that DRA risk is governed by the spectral properties of the Jacobian matrix of the exchanged model updates or feature embeddings, providing a unified explanation for the effectiveness of existing defense methods. Second, we develop InvRE, an InvLoss-based DRA risk estimator that offers attack-agnostic, comprehensive risk evaluation across data instances and model architectures. Third, we propose two adaptive noise perturbation defenses that enhance FL privacy without harming classification accuracy. Extensive experiments on real-world datasets validate our framework, demonstrating its potential for systematic DRA risk evaluation and mitigation in FL systems.
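The abstract's first finding, that DRA risk is governed by the Jacobian spectrum of the exchanged embeddings, can be illustrated with a toy computation. The sketch below (not the paper's actual InvRE estimator; the network, dimensions, and finite-difference Jacobian are illustrative assumptions) computes the singular values of the input-to-embedding Jacobian for a small two-layer network. Directions with small singular values are hard for an attacker to invert, which is the intuition behind spectrum-based risk estimates.

```python
import numpy as np

# Illustrative sketch only: a toy stand-in for the paper's Jacobian
# spectral analysis, not its actual InvRE implementation.
rng = np.random.default_rng(0)

W1 = rng.normal(size=(16, 8)) / np.sqrt(8)   # toy layer-1 weights
W2 = rng.normal(size=(4, 16)) / np.sqrt(16)  # toy layer-2 weights

def embed(x):
    """Toy feature embedding: linear -> tanh -> linear."""
    return W2 @ np.tanh(W1 @ x)

def jacobian(f, x, eps=1e-6):
    """Finite-difference Jacobian of f at x, shape (out_dim, in_dim)."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        J[:, i] = (f(x + dx) - y0) / eps
    return J

x = rng.normal(size=8)                       # one data instance
J = jacobian(embed, x)
sigma = np.linalg.svd(J, compute_uv=False)   # singular values, descending

# A near-zero smallest singular value means the embedding map loses
# information along some input direction, i.e. lower reconstruction risk
# along that direction; a well-conditioned spectrum means higher risk.
print("singular values:", sigma)
```

In practice one would use automatic differentiation (e.g. `torch.autograd.functional.jacobian`) rather than finite differences, and evaluate the spectrum per data instance, matching the per-sample character of InvLoss.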