Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

📅 2026-04-22
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Existing membership inference attacks fail against federated large language models because of their massive parameter scale, rapid convergence, and sparse, non-orthogonal gradients. This work proposes ProjRes, the first passive attack that uses the projection residuals of hidden embeddings onto gradient subspaces to establish an intrinsic link between gradients and inputs, without requiring shadow models, auxiliary classifiers, or historical updates. ProjRes is architecture-agnostic and achieves near-100% attack accuracy across four benchmarks and four prominent large language models, outperforming state-of-the-art methods by up to 75.75%. It remains effective even under strong differential privacy defenses.

📝 Abstract
Federated Large Language Models (FedLLMs) enable multiple parties to collaboratively fine-tune LLMs without sharing raw data, addressing challenges of limited resources and privacy concerns. Despite data localization, shared gradients can still expose sensitive information through membership inference attacks (MIAs). However, the unique properties of FedLLMs, namely massive parameter scales, rapid convergence, and sparse, non-orthogonal gradients, render existing MIAs ineffective. To address this gap, we propose ProjRes, the first projection-residual-based passive MIA tailored for FedLLMs. ProjRes uses hidden embedding vectors as sample representations and analyzes their projection residuals on the gradient subspace to uncover the intrinsic link between gradients and inputs. It requires no shadow models, auxiliary classifiers, or historical updates, ensuring efficiency and robustness. Experiments on four benchmarks and four LLMs show that ProjRes achieves near 100% accuracy, outperforming prior methods by up to 75.75%, and remains effective even under strong differential privacy defenses. Our findings reveal a previously overlooked privacy vulnerability in FedLLMs and call for a re-examination of their security assumptions. Our code and data are available at https://anonymous.4open.science/r/Passive-MIA-5268.
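
The projection-residual idea in the abstract can be illustrated on a single linear layer. The sketch below is not the authors' implementation; it only assumes a layer y = W h, for which the weight gradient is a sum of per-sample outer products, so every row of the gradient lies in the span of the batch's hidden embeddings. Under that assumption, a sample that was in the batch should leave almost no residual when projected onto the gradient's row space, while an unrelated sample should leave a large one. The function name `projection_residual`, the dimensions, and the random data are all hypothetical.

```python
import numpy as np

def projection_residual(grad, h, rcond=1e-10):
    """Relative norm of the part of h that lies outside the row space of grad."""
    # Orthonormal basis of the row space, taken from the right singular vectors.
    _, s, vt = np.linalg.svd(grad, full_matrices=False)
    rank = int(np.sum(s > rcond * s[0]))   # numerical rank of the gradient
    basis = vt[:rank]                       # shape (rank, d_in)
    projection = basis.T @ (basis @ h)      # component of h inside the subspace
    return np.linalg.norm(h - projection) / np.linalg.norm(h)

rng = np.random.default_rng(0)
d_in, d_out, batch = 64, 32, 4              # toy sizes (assumed, not from the paper)

H = rng.normal(size=(batch, d_in))          # hidden embeddings of the batch members
D = rng.normal(size=(batch, d_out))         # per-sample upstream gradients dL/dy
grad = D.T @ H                              # dL/dW for y = W h, shape (d_out, d_in)

member_score = projection_residual(grad, H[0])
non_member_score = projection_residual(grad, rng.normal(size=d_in))

print(f"member residual:     {member_score:.3e}")
print(f"non-member residual: {non_member_score:.3f}")
```

In this toy setting a simple threshold on the residual separates the two cases. Whether the same separation holds for real FedLLM updates, with aggregation, multiple local steps, or added noise, is what the paper's experiments address; the sketch makes no claim about that.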
Problem

Research questions and friction points this paper is trying to address.

Membership Inference Attacks
Federated Large Language Models
Privacy Vulnerability
Gradient Leakage
Federated Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

membership inference attack
federated large language models
projection residual
gradient subspace
privacy vulnerability

Authors

Guilin Deng
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Silong Chen
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Yuchuan Luo
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Yi Liu
Department of Computer Science, City University of Hong Kong
Security and Privacy, Federated Learning, AI Security
Songlei Wang
Shenzhen University, Shenzhen, China
Zhiping Cai
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Lin Liu
College of Computer Science and Technology, National University of Defense Technology, Changsha, China
Xiaohua Jia
Chinese Academy of Science
Shaojing Fu
College of Computer Science and Technology, National University of Defense Technology, Changsha, China