Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

📅 2026-04-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing membership inference attacks fail against federated large language models because of the models' massive parameter scale, rapid convergence, and sparse, non-orthogonal gradients. This work proposes ProjRes, the first passive attack that uses projection residuals of hidden embeddings onto gradient subspaces to establish an intrinsic link between gradients and inputs. It requires no shadow models, auxiliary classifiers, or historical updates. ProjRes is architecture-agnostic and achieves attack accuracy approaching 100% across four benchmarks and four prominent large language models, outperforming state-of-the-art methods by up to 75.75%. It remains highly effective even under strong differential privacy defenses.

📝 Abstract

Federated Large Language Models (FedLLMs) enable multiple parties to collaboratively fine-tune LLMs without sharing raw data, addressing challenges of limited resources and privacy concerns. Despite data localization, shared gradients can still expose sensitive information through membership inference attacks (MIAs). However, the unique properties of FedLLMs, namely massive parameter scales, rapid convergence, and sparse, non-orthogonal gradients, render existing MIAs ineffective. To address this gap, we propose ProjRes, the first projection-residual-based passive MIA tailored for FedLLMs. ProjRes uses hidden embedding vectors as sample representations and analyzes their projection residuals on the gradient subspace to uncover the intrinsic link between gradients and inputs. It requires no shadow models, auxiliary classifiers, or historical updates, ensuring efficiency and robustness. Experiments on four benchmarks and four LLMs show that ProjRes achieves near 100% accuracy, outperforming prior methods by up to 75.75%, and remains effective even under strong differential privacy defenses. Our findings reveal a previously overlooked privacy vulnerability in FedLLMs and call for a re-examination of their security assumptions. Our code and data are available at https://anonymous.4open.science/r/Passive-MIA-5268.
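
The link between gradients and inputs can be illustrated on a single linear layer. The following is a minimal sketch, not the authors' implementation. It assumes a layer y = W h, whose weight gradient is a sum of outer products of upstream gradients and input embeddings, so the row space of that gradient is spanned by the embeddings in the training batch. An embedding that was in the batch then leaves almost no residual when projected onto that row space, while an unrelated embedding leaves a large one. The function name `projection_residual`, the variable names, and the toy dimensions are hypothetical.

```python
import numpy as np

def projection_residual(h, grad, rcond=1e-10):
    """Relative norm of the component of h outside the row space of grad."""
    # Right singular vectors with non-negligible singular values form an
    # orthonormal basis of the gradient's row space.
    _, s, vt = np.linalg.svd(grad, full_matrices=False)
    basis = vt[s > rcond * s.max()]
    # Subtract the orthogonal projection of h onto that row space.
    residual = h - basis.T @ (basis @ h)
    return np.linalg.norm(residual) / np.linalg.norm(h)

rng = np.random.default_rng(0)
d_in, d_out, batch = 64, 32, 4            # toy sizes (assumption)

H = rng.standard_normal((batch, d_in))    # hidden embeddings of the batch
D = rng.standard_normal((batch, d_out))   # upstream gradients (synthetic)
grad_W = D.T @ H                          # sum_i delta_i h_i^T, shape (d_out, d_in)

member = H[0]                             # embedding that was in the batch
non_member = rng.standard_normal(d_in)    # embedding that was not

r_member = projection_residual(member, grad_W)
r_non = projection_residual(non_member, grad_W)
print(f"member residual:     {r_member:.2e}")
print(f"non-member residual: {r_non:.2e}")
```

In this toy setting a simple threshold on the residual separates the two cases. Real FedLLM gradients are aggregated over many tokens and may be perturbed by defenses, so the separation reported in the paper depends on details this sketch does not model.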

Problem

Research questions and friction points this paper is trying to address.

Membership Inference Attacks

Federated Large Language Models

Privacy Vulnerability

Gradient Leakage

Federated Learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Membership Inference Attack

Federated Large Language Models

Projection Residual