Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS

📅 2025-08-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
LLM-based multi-agent systems (LLM-MAS) rely on inter-agent communication to accomplish complex tasks, yet their communication channels suffer from severe security vulnerabilities. Existing attacks either compromise agents’ internal structures or depend on explicit persuasion, exhibiting low efficacy, poor adaptability, and weak stealth. Method: This paper proposes MAST, a Multi-round Adaptive Stealthy Tampering framework that shifts the attack focus from agent internals to communication messages themselves—marking the first such approach. MAST integrates Monte Carlo Tree Search with direct preference optimization to train adaptive attack policies, and enforces dual constraints—semantic similarity and embedding-space proximity—to ensure high stealth and robust adaptability across multi-turn interactions. Contribution/Results: Extensive experiments across diverse tasks, agent architectures, and large language models demonstrate that MAST significantly improves both attack success rate and stealth. These findings underscore the critical need for—and identify key pathways toward—securing the communication layer in LLM-MAS.

📝 Abstract
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
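The abstract describes training the attack policy by combining Monte Carlo Tree Search with Direct Preference Optimization, but gives no implementation detail. As an illustrative sketch only (all names here are hypothetical, not from the paper): one natural way to couple the two is to mine preference pairs for DPO from the search tree, treating a higher-valued sibling tampering as "chosen" and a lower-valued one as "rejected".

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # One candidate tampered message explored during tree search.
    message: str
    value: float = 0.0          # e.g., estimated attack success from rollouts
    children: list = field(default_factory=list)

def dpo_pairs(node, context, pairs=None):
    """Walk the search tree and emit (context, chosen, rejected) triples:
    among sibling tamperings, the highest-valued one beats each lower one."""
    if pairs is None:
        pairs = []
    kids = sorted(node.children, key=lambda n: n.value, reverse=True)
    for worse in kids[1:]:
        pairs.append((context, kids[0].message, worse.message))
    for child in kids:
        dpo_pairs(child, context + " | " + node.message, pairs)
    return pairs
```

The resulting triples match the (prompt, chosen, rejected) format that standard DPO training pipelines consume; how MAST actually scores nodes and builds pairs is not specified in this summary.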
Problem

Research questions and friction points this paper is trying to address.

Exploiting communication vulnerabilities in LLM-MAS
Enhancing attack stealthiness with adaptive strategies
Addressing limitations of existing overt persuasion methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-round adaptive stealthy tampering framework
Monte Carlo Tree Search with Direct Preference Optimization
Dual semantic and embedding similarity constraints
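The dual-constraint idea above can be sketched as a gate the attacker applies before forwarding a tampered message. This is a minimal toy illustration, not the paper's implementation: token-level Jaccard overlap stands in for the semantic constraint and cosine similarity over bag-of-words vectors stands in for the embedding constraint (a real attack would use a sentence encoder), and the thresholds are arbitrary.

```python
import math
from collections import Counter

def bow_embed(text):
    # Toy bag-of-words "embedding"; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def passes_stealth_check(original, tampered, sem_thresh=0.6, emb_thresh=0.7):
    # Dual constraint: forward the tampered message only if it stays close
    # to the original both lexically (semantic proxy) and in embedding
    # space; otherwise the attacker must resample a subtler edit.
    sem_ok = jaccard(original.lower().split(), tampered.lower().split()) >= sem_thresh
    emb_ok = cosine(bow_embed(original), bow_embed(tampered)) >= emb_thresh
    return sem_ok and emb_ok
```

A subtle rewording passes both checks while a blatant injection fails them, which is the stealth property the framework enforces.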
Bingyu Yan
School of Cyber Science and Technology, Beihang University, Beijing, China
Ziyi Zhou
School of Cyber Science and Technology, Beihang University, Beijing, China
Xiaoming Zhang
School of Cyber Science and Technology, Beihang University, Beijing, China
Chaozhuo Li
Microsoft Research Asia
Ruilin Zeng
School of Cyber Science and Technology, Beihang University, Beijing, China
Yirui Qi
School of Cyber Science and Technology, Beihang University, Beijing, China
Tianbo Wang
School of Cyber Science and Technology, Beihang University, Beijing, China
Litian Zhang
Beihang University