🤖 AI Summary
AI agent systems lack end-to-end security assurance, exposing critical vulnerabilities beyond isolated model-level defenses.
Method: This work transfers foundational principles from traditional computer system security to agentic computing, integrating systematic security analysis, attack surface modeling, and threat assessment. Leveraging 11 real-world attack cases, we construct an empirically grounded attacker model to identify agent-specific security challenges.
Contribution/Results: We present the first systematic formalization of end-to-end security properties for AI agents, extending beyond single-model hardening. We propose a structured framework of security principles tailored to agentic computing, encompassing confidentiality, integrity, availability, accountability, and goal alignment across the full agent lifecycle. Our analysis uncovers 11 distinct security research challenges inherent to agent architectures, including tool-mediated privilege escalation, dynamic plan injection, and emergent delegation risks. This work establishes foundational concepts and actionable design guidelines for system-level AI security, enabling principled, deployable defenses in production agentic systems.
📝 Abstract
This paper articulates short- and long-term research problems in AI agent security and privacy through the lens of computer systems security. This approach examines the end-to-end security properties of entire systems, rather than AI models in isolation. While hardening a single model is useful, it is often insufficient on its own. By way of analogy, creating a model that is always helpful and harmless is akin to creating software that is always helpful and harmless; the collective experience of decades of cybersecurity research and practice shows that this alone is not enough. Rather, constructing an informed and realistic attacker model before building a system, applying hard-earned lessons from software security, and continuously improving security posture together form a tried-and-tested approach to securing real computer systems. A key goal is to identify the research challenges that arise when applying traditional security principles in the context of AI agents. A secondary goal of this paper is to distill these ideas for AI and ML practitioners and researchers. We discuss the challenges of applying security principles to agentic computing, present 11 case studies of real attacks on agentic systems, and define a series of new research problems specific to the security of agentic systems.