Multi-Agent Systems Execute Arbitrary Malicious Code

📅 2025-03-15

📈 Citations: 0

✨ Influential: 0

career value

212K/year

🤖 AI Summary

This paper identifies a critical control-flow hijacking vulnerability in multi-agent systems when processing untrusted inputs (e.g., malicious web pages or email attachments): even if individual agents are robust against prompt injection and refuse harmful instructions, their inter-agent coordination mechanisms can be subverted to achieve arbitrary code execution and exfiltration of sensitive data within the containerized environment. Leveraging red-teaming on mainstream frameworks—including AutoGen and CrewAI—the authors conduct dynamic inter-agent communication monitoring and sandbox escape analysis to realize, for the first time in realistic settings, an end-to-end attack chain. Key contributions are: (1) identification and formalization of “control-flow hijacking” as a novel system-level attack surface; (2) empirical demonstration that agent-level security guarantees do not ensure system-level security; and (3) proposal of the first security evaluation paradigm specifically designed for multi-agent systems.

Technology Category

Application Category

📝 Abstract

Multi-agent systems coordinate LLM-based agents to perform tasks on users' behalf. In real-world applications, multi-agent systems will inevitably interact with untrusted inputs, such as malicious Web content, files, email attachments, etc. Using several recently proposed multi-agent frameworks as concrete examples, we demonstrate that adversarial content can hijack control and communication within the system to invoke unsafe agents and functionalities. This results in a complete security breach, up to execution of arbitrary malicious code on the user's device and/or exfiltration of sensitive data from the user's containerized environment. We show that control-flow hijacking attacks succeed even if the individual agents are not susceptible to direct or indirect prompt injection, and even if they refuse to perform harmful actions.

Problem

Research questions and friction points this paper is trying to address.

Multi-agent systems vulnerable to malicious inputs

Adversarial content hijacks control and communication

Execution of arbitrary malicious code on devices

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent systems coordinate LLM-based agents

Adversarial content hijacks control and communication

Control-flow hijacking bypasses agent safeguards

🔎 Similar Papers

Large Model Based Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends