Attack the Messages, Not the Agents: A Multi-round Adaptive Stealthy Tampering Framework for LLM-MAS

📅 2025-08-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
LLM-based multi-agent systems (LLM-MAS) rely on inter-agent communication to accomplish complex tasks, yet their communication channels suffer from severe security vulnerabilities. Existing attacks either compromise agents’ internal structures or depend on explicit persuasion, exhibiting low efficacy, poor adaptability, and weak stealth. Method: This paper proposes MAST, a Multi-round Adaptive Stealthy Tampering framework that shifts the attack focus from agent internals to communication messages themselves—marking the first such approach. MAST integrates Monte Carlo Tree Search with direct preference optimization to train adaptive attack policies, and enforces dual constraints—semantic similarity and embedding-space proximity—to ensure high stealth and robust adaptability across multi-turn interactions. Contribution/Results: Extensive experiments across diverse tasks, agent architectures, and large language models demonstrate that MAST significantly improves both attack success rate and stealth. These findings underscore the critical need for—and identify key pathways toward—securing the communication layer in LLM-MAS.

📝 Abstract
Large language model-based multi-agent systems (LLM-MAS) effectively accomplish complex and dynamic tasks through inter-agent communication, but this reliance introduces substantial safety vulnerabilities. Existing attack methods targeting LLM-MAS either compromise agent internals or rely on direct and overt persuasion, which limit their effectiveness, adaptability, and stealthiness. In this paper, we propose MAST, a Multi-round Adaptive Stealthy Tampering framework designed to exploit communication vulnerabilities within the system. MAST integrates Monte Carlo Tree Search with Direct Preference Optimization to train an attack policy model that adaptively generates effective multi-round tampering strategies. Furthermore, to preserve stealthiness, we impose dual semantic and embedding similarity constraints during the tampering process. Comprehensive experiments across diverse tasks, communication architectures, and LLMs demonstrate that MAST consistently achieves high attack success rates while significantly enhancing stealthiness compared to baselines. These findings highlight the effectiveness, stealthiness, and adaptability of MAST, underscoring the need for robust communication safeguards in LLM-MAS.
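The abstract describes training the attack policy by combining Monte Carlo Tree Search with Direct Preference Optimization, but gives no implementation detail. As an illustrative sketch only (all names here are hypothetical, not from the paper): one natural way to couple the two is to mine preference pairs for DPO from the search tree, treating a higher-valued sibling tampering as "chosen" and a lower-valued one as "rejected".

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # One candidate tampered message explored during tree search.
    message: str
    value: float = 0.0          # e.g., estimated attack success from rollouts
    children: list = field(default_factory=list)

def dpo_pairs(node, context, pairs=None):
    """Walk the search tree and emit (context, chosen, rejected) triples:
    among sibling tamperings, the highest-valued one beats each lower one."""
    if pairs is None:
        pairs = []
    kids = sorted(node.children, key=lambda n: n.value, reverse=True)
    for worse in kids[1:]:
        pairs.append((context, kids[0].message, worse.message))
    for child in kids:
        dpo_pairs(child, context + " | " + node.message, pairs)
    return pairs
```

The resulting triples match the (prompt, chosen, rejected) format that standard DPO training pipelines consume; how MAST actually scores nodes and builds pairs is not specified in this summary.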
Problem

Research questions and friction points this paper is trying to address.

Exploiting communication vulnerabilities in LLM-MAS
Enhancing attack stealthiness with adaptive strategies
Addressing limitations of existing overt persuasion methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-round adaptive stealthy tampering framework
Monte Carlo Tree Search with Direct Preference Optimization
Dual semantic and embedding similarity constraints
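The dual-constraint idea above can be sketched as a gate the attacker applies before forwarding a tampered message. This is a minimal toy illustration, not the paper's implementation: token-level Jaccard overlap stands in for the semantic constraint and cosine similarity over bag-of-words vectors stands in for the embedding constraint (a real attack would use a sentence encoder), and the thresholds are arbitrary.

```python
import math
from collections import Counter

def bow_embed(text):
    # Toy bag-of-words "embedding"; a real system would use a sentence encoder.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard(a, b):
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def passes_stealth_check(original, tampered, sem_thresh=0.6, emb_thresh=0.7):
    # Dual constraint: forward the tampered message only if it stays close
    # to the original both lexically (semantic proxy) and in embedding
    # space; otherwise the attacker must resample a subtler edit.
    sem_ok = jaccard(original.lower().split(), tampered.lower().split()) >= sem_thresh
    emb_ok = cosine(bow_embed(original), bow_embed(tampered)) >= emb_thresh
    return sem_ok and emb_ok
```

A subtle rewording passes both checks while a blatant injection fails them, which is the stealth property the framework enforces.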
Bingyu Yan
School of Cyber Science and Technology, Beihang University, Beijing, China
Ziyi Zhou
School of Cyber Science and Technology, Beihang University, Beijing, China
Xiaoming Zhang
School of Cyber Science and Technology, Beihang University, Beijing, China
Chaozhuo Li
Microsoft Research Asia
Ruilin Zeng
School of Cyber Science and Technology, Beihang University, Beijing, China
Yirui Qi
School of Cyber Science and Technology, Beihang University, Beijing, China
Tianbo Wang
School of Cyber Science and Technology, Beihang University, Beijing, China
Litian Zhang
Beihang University