🤖 AI Summary
This study addresses emergent security risks in cross-domain multi-agent large language model (LLM) systems, particularly cross-organization confidentiality breaches and policy violations arising from agent-to-agent interactions, which conventional software-vulnerability-centric paradigms capture inadequately.
Method: We propose a dynamic evaluation framework that integrates security modeling, threat-tree analysis, adversarial scenario simulation, and formal policy verification, enabling systematic identification and classification of domain-specific threats.
Contribution/Results: We introduce the first taxonomy of seven distinct security challenges unique to cross-domain multi-agent LLMs; establish a quantifiable attack model, standardized evaluation metrics, and a research roadmap; and provide verifiable safety requirements for critical techniques including alignment, sandboxing, and access control. This work bridges a foundational gap in multi-agent LLM security theory, offering a rigorous, actionable foundation for secure system design and governance.
📝 Abstract
Large language models (LLMs) are rapidly evolving into autonomous agents that cooperate across organizational boundaries, enabling joint disaster response, supply-chain optimization, and other tasks that demand decentralized expertise without surrendering data ownership. Yet cross-domain collaboration shatters the unified trust assumptions behind current alignment and containment techniques. An agent that is benign in isolation may, when receiving messages from an untrusted peer, leak secrets or violate policy, producing risks driven by emergent multi-agent dynamics rather than classical software bugs. This position paper maps the security agenda for cross-domain multi-agent LLM systems. We introduce seven categories of novel security challenges and, for each, present plausible attacks, security evaluation metrics, and future research guidelines.