Graph Representation Learning Augmented Model Manipulation on Federated Fine-Tuning of LLMs

📅 2026-05-08
🤖 AI Summary
This work studies a security threat in federated fine-tuning of large language models: malicious participants can degrade global performance by uploading manipulated updates. The paper proposes AugMP, an attack strategy that, according to the authors for the first time, applies graph representation learning to model manipulation in federated fine-tuning. AugMP uses graph neural networks to model feature correlations among benign updates and generates malicious updates that are both effective and stealthy. An augmented Lagrangian dual optimization framework embeds the adversarial objective while preserving the statistical characteristics of benign parameter distributions. In experiments across multiple large language models, AugMP reduces global accuracy by up to 26% and the average accuracy of local LLM agents by up to 22%, while evading conventional distance- and similarity-based defenses.
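To make the phrase "distance- and similarity-based defenses" concrete, the following is a minimal sketch of that kind of screening step, not the specific defenses evaluated in the paper. It assumes each update is a flat list of floats, uses the coordinate-wise median as a reference point, and flags an update when it is too far from the reference or points in a sufficiently different direction. The function names and thresholds (`screen_updates`, `max_distance`, `min_cosine`) are hypothetical.

```python
import math
from statistics import median


def l2_distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_updates(updates, max_distance, min_cosine):
    """Return the indices of updates flagged as suspicious.

    Hypothetical screening rule: compare every update with the
    coordinate-wise median of all updates and flag it if it is too far
    away (distance check) or poorly aligned (similarity check).
    """
    reference = [median(coords) for coords in zip(*updates)]
    flagged = []
    for index, update in enumerate(updates):
        too_far = l2_distance(update, reference) > max_distance
        misaligned = cosine_similarity(update, reference) < min_cosine
        if too_far or misaligned:
            flagged.append(index)
    return flagged


# Toy data: three similar updates and one sign-flipped outlier.
updates = [
    [0.10, 0.20, 0.30],
    [0.12, 0.18, 0.31],
    [0.09, 0.22, 0.29],
    [-0.50, -0.40, -0.60],
]
flagged = screen_updates(updates, max_distance=0.5, min_cosine=0.0)
print(flagged)
```

A check of this form only sees distance and direction relative to a reference, which is why the summary stresses that updates preserving benign-like statistics can pass it.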
📝 Abstract
Federated fine-tuning (FFT) has emerged as a privacy-preserving paradigm for collaboratively adapting large language models (LLMs). Built upon federated learning, FFT enables distributed agents to jointly refine a shared pretrained LLM by aggregating local LLM updates without sharing local raw data. However, FFT-based LLMs remain vulnerable to model manipulation threats, in which adversarial participants upload manipulated LLM updates that corrupt the aggregation process and degrade the performance of the global LLM. In this paper, we propose an Augmented Model maniPulation (AugMP) strategy against FFT-based LLMs. Specifically, we design a novel graph representation learning framework that captures feature correlations among benign LLM updates to guide the generation of malicious updates. To enhance manipulation effectiveness and stealthiness, we develop an iterative manipulation algorithm based on an augmented Lagrangian dual formulation. Through this formulation, malicious updates are optimized to embed adversarial objectives while preserving benign-like parameter characteristics. Experimental results across multiple LLM backbones demonstrate that the AugMP strategy achieves the strongest manipulation performance among all competing baselines, reducing the global LLM accuracy by up to 26% and degrading the average accuracy of local LLM agents by up to 22%. Meanwhile, AugMP maintains high statistical and geometric consistency with benign updates, enabling it to evade conventional distance- and similarity-based defense methods.
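For readers unfamiliar with the term, the generic textbook form of the augmented Lagrangian method is sketched below. This is not the paper's formulation: the objective $f$, the constraint function $c$, the multiplier $\lambda$, and the penalty weight $\rho$ are placeholder symbols assumed here for illustration, and the abstract does not specify the actual objective or constraints.

```latex
% Generic equality-constrained problem (placeholder symbols)
\min_{x} \; f(x) \quad \text{s.t.} \quad c(x) = 0

% Augmented Lagrangian with multiplier \lambda and penalty weight \rho > 0
\mathcal{L}_{\rho}(x, \lambda) = f(x) + \lambda^{\top} c(x) + \frac{\rho}{2} \lVert c(x) \rVert_2^2

% Iteration: primal minimization followed by a dual (multiplier) update
x^{k+1} = \arg\min_{x} \; \mathcal{L}_{\rho}(x, \lambda^{k}), \qquad
\lambda^{k+1} = \lambda^{k} + \rho \, c(x^{k+1})
```

In this generic scheme the quadratic penalty pushes iterates toward the constraint set while the multiplier update corrects the remaining violation, which fits the abstract's description of an iterative algorithm that pursues one objective while keeping other quantities close to prescribed values.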
Problem

Research questions and friction points this paper is trying to address.

federated fine-tuning
model manipulation
large language models
adversarial attacks
privacy-preserving learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Representation Learning
Federated Fine-Tuning
Model Manipulation
Adversarial Attack
Augmented Lagrangian
🔎 Similar Papers
Hanlin Cai
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.
Kai Li
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.; and Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
Houtianfu Wang
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.
Haofan Dong
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.
Yichen Li
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.
Falko Dressler
Technische Universität Berlin
internet of things, vehicular networks, 5G/6G, edge computing, molecular communication
Ozgur B. Akan
Centre for neXt Communications (CXC), Department of Engineering, University of Cambridge, CB3 0FA Cambridge, U.K.; and Center for neXt-Generation Communications (CXC), Department of Electrical and Electronics Engineering, Koç University, 34450 Istanbul, Türkiye