Agent Security is a Systems Problem

📅 2026-05-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
Current approaches to AI agent security rely heavily on model robustness and lack system-level guarantees. This position paper proposes treating the AI model as an untrusted component and enforcing security invariants at the system level, drawing on established principles from operating systems, networking, formal methods, and adversarial machine learning. The authors argue that complementing model-robustness efforts with these systems security techniques provides a foundation for agentic systems with predictable guarantees. As evidence, they analyze eleven representative real-world attacks on agents, discuss how the principles, if realized, could have prevented them, and identify the research challenges that stand in the way of implementing the principles in agents.
📝 Abstract
We take the position that agent security must be approached as a systems problem: the AI model powering the agent must be treated as an untrusted component, and security invariants must be enforced at the system level. Through this lens, efforts to increase model robustness (the dominant viewpoint in the community) are insufficient on their own. Instead, we must complement existing efforts with techniques from the systems security domain. Based on our experience as cybersecurity researchers in operating systems, networks, formal methods, and adversarial machine learning, we articulate a set of core principles, grounded in decades of systems security research, that provide a foundation for designing agentic systems with predictable guarantees. As evidence, we analyze eleven representative real-world attacks on agents and discuss how systems principles, if realized, could have prevented these attacks. We also identify the research challenges that stand in the way of implementing these principles in agents.
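To make the abstract's central idea concrete, the following is a minimal sketch of a reference monitor that mediates every tool call proposed by an untrusted model. It is an illustration under stated assumptions, not the authors' design: the names `ToolCall`, `ReferenceMonitor`, and `http_get`, the two invariants, and the host `docs.example.com` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """An action proposed by the (untrusted) model. Hypothetical type."""
    tool: str
    args: dict


@dataclass
class ReferenceMonitor:
    """Checks every proposed call against a policy before it runs."""
    allowed_tools: set
    allowed_hosts: set
    audit_log: list = field(default_factory=list)

    def check(self, call: ToolCall) -> bool:
        # Invariant 1 (assumed): only allowlisted tools may run.
        if call.tool not in self.allowed_tools:
            return False
        # Invariant 2 (assumed): network calls go only to allowlisted hosts.
        if call.tool == "http_get":
            return call.args.get("host") in self.allowed_hosts
        return True

    def execute(self, call: ToolCall, tools: dict):
        allowed = self.check(call)
        # Record every decision, allowed or not, for later audit.
        self.audit_log.append((call.tool, allowed))
        if not allowed:
            raise PermissionError(f"blocked tool call: {call.tool}")
        return tools[call.tool](**call.args)


# Hypothetical tool table; a real agent would wire in actual implementations.
tools = {"http_get": lambda host, path: f"GET {host}{path}"}
monitor = ReferenceMonitor(
    allowed_tools={"http_get"},
    allowed_hosts={"docs.example.com"},
)

# A benign call passes the policy check and is executed.
result = monitor.execute(
    ToolCall("http_get", {"host": "docs.example.com", "path": "/index"}), tools
)

# A call a prompt-injected model might propose is rejected by the monitor,
# regardless of what the model "intended".
try:
    monitor.execute(
        ToolCall("http_get", {"host": "attacker.example", "path": "/leak"}), tools
    )
except PermissionError:
    pass
```

The point of the sketch is that the policy check lives outside the model: the invariants hold even if the model's output is fully attacker-controlled, because the model can only propose calls and never invokes a tool directly.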
Problem

Research questions and friction points this paper is trying to address.

agent security
systems problem
security invariants
model robustness
agentic systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

agent security
systems security
untrusted AI model
security invariants
adversarial robustness
Mihai Christodorescu
Google
Computer Security, Programming Languages, Formal Methods
Earlence Fernandes
Assistant Professor, UC San Diego
Computer Security, Computer Systems
Ashish Hooda
Google
Somesh Jha
Lubar Chair of Computer Science, University of Wisconsin
Trustworthy Machine Learning, Security, Formal methods, Programming Languages
Johann Rehberger
EmbraceTheRed
Kamalika Chaudhuri
FAIR @ Meta
Trustworthy AI
Xiaohan Fu
University of California San Diego
Khawaja Shams
Google
Guy Amir
Cornell University
AI Safety, Neural Network Verification, AI Explainability, Formal Methods, Robotics
Jihye Choi
University of Wisconsin–Madison
Sarthak Choudhary
University of Wisconsin–Madison
Nils Palumbo
PhD Student in Computer Science, UW-Madison
Andrey Labunets
University of California San Diego
Nishit V. Pandya
University of California San Diego