Uncovering Gradient Inversion Risks in Practical Language Model Training

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work uncovers an underappreciated gradient inversion privacy risk for language models (LMs) in federated learning (FL): despite the common assumption that text discreteness inherently mitigates such attacks, the authors demonstrate that LMs remain highly vulnerable under realistic FL training conditions. To address this, they propose Grab, the first hybrid-optimization attack tailored to LMs, which jointly optimizes layer-wise dropout masks and discrete token sequences by integrating continuous-space gradient-based optimization with discrete-space combinatorial search. This design overcomes both FL's inherent stochasticity and token discreteness. Experiments across benchmark and realistic FL settings show that Grab achieves up to a 92.9% input recovery rate, outperforming existing baselines by up to 28.9% in benchmark settings and 48.5% in practical settings. The results expose a substantial, previously underestimated privacy leakage in FL training of language models.
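Gradient inversion works because the gradients a client shares in FL encode its private inputs; for LMs, the embedding-layer gradient is especially leaky. The following is a minimal, hypothetical NumPy sketch of this leakage (toy model and values chosen for illustration, not the paper's Grab attack): rows of the embedding-table gradient are nonzero exactly for the tokens that appeared in the client's input, so an attacker can read off the token set directly.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 50, 8
E = rng.normal(size=(VOCAB, DIM))   # shared embedding table (model weights)
w = rng.normal(size=DIM)            # toy linear head

def grad_embedding(token_ids, target):
    """Gradient of a squared loss w.r.t. the embedding table, one example."""
    h = E[token_ids].mean(axis=0)         # mean-pooled input embedding
    d_h = 2.0 * (h @ w - target) * w      # dL/dh for L = (h.w - target)^2
    g = np.zeros_like(E)
    for t in token_ids:                   # only rows of used tokens get gradient
        g[t] += d_h / len(token_ids)
    return g

private_tokens = [3, 17, 42]              # the client's private input
g = grad_embedding(private_tokens, target=1.0)

# Attacker side: nonzero gradient rows reveal exactly which tokens were used.
recovered = [int(t) for t in np.flatnonzero(np.abs(g).sum(axis=1) > 1e-9)]
print(recovered)  # → [3, 17, 42]
```

This recovers the token *set* but not the order, which is why an attack like Grab pairs continuous recovery with a separate discrete sequencing stage.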

📝 Abstract
The gradient inversion attack has been demonstrated as a significant privacy threat to federated learning (FL), particularly in continuous domains such as vision models. In contrast, it is often considered less effective or highly dependent on impractical training settings when applied to language models, due to the challenges posed by the discrete nature of tokens in text data. As a result, its potential privacy threats remain largely underestimated, despite FL being an emerging training method for language models. In this work, we propose a domain-specific gradient inversion attack named Grab (gradient inversion with hybrid optimization). Grab features two alternating optimization processes to address the challenges caused by practical training settings, including a simultaneous optimization on dropout masks between layers for improved token recovery and a discrete optimization for effective token sequencing. Grab can recover a significant portion (up to 92.9% recovery rate) of the private training data, outperforming the attack strategy of utilizing discrete optimization with an auxiliary model by notable improvements of up to 28.9% recovery rate in benchmark settings and 48.5% recovery rate in practical settings. Grab provides a valuable step forward in understanding this privacy threat in the emerging FL training mode of language models.
Problem

Research questions and friction points this paper is trying to address.

Assessing gradient inversion risks in language model training
Overcoming discrete token challenges in privacy attacks
Improving data recovery rates in federated learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid optimization for gradient inversion
Alternating optimization on dropout masks
Discrete optimization for token sequencing
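The hybrid idea above alternates a continuous stage (recovering which tokens are present) with a discrete stage (ordering them). Below is a hypothetical toy sketch of the discrete sequencing stage only, assuming the token set has already been recovered: the attacker searches permutations and keeps the one whose simulated gradient best matches the observed gradient. The model, pooling scheme, and names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(1)
VOCAB, DIM, SEQ = 20, 6, 4
E = rng.normal(size=(VOCAB, DIM))   # shared embedding table
pos = rng.normal(size=(SEQ, 1))     # positional weights make the model order-sensitive
w = rng.normal(size=DIM)            # toy linear head

def head_grad(token_ids, w, target=0.0):
    """Gradient of a squared loss w.r.t. the head w for an ordered sequence."""
    h = (pos * E[list(token_ids)]).sum(axis=0)   # order-sensitive pooling
    return 2.0 * (h @ w - target) * h

secret = (7, 2, 11, 5)               # the client's private ordered input
g_obs = head_grad(secret, w)         # the gradient the server observes

# Stage 1 (assumed already done): continuous optimization recovered the token set.
bag = sorted(secret)

# Stage 2: discrete search over orderings; keep the permutation whose
# simulated gradient is closest to the observed gradient.
best = min(permutations(bag),
           key=lambda p: np.linalg.norm(head_grad(p, w) - g_obs))
print(best)  # → (7, 2, 11, 5)
```

Exhaustive permutation search is only feasible for short sequences; the paper's discrete optimization would replace it with a scalable combinatorial search, but the gradient-matching objective is the same in spirit.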