AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents

📅 2025-03-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Autonomous LLM-agent execution introduces security risks such as vulnerabilities, illegal actions, and harmful behaviors, and existing mitigation approaches lack robustness, interpretability, and adaptability. This paper introduces AgentSpec, the first declarative, modular, and interpretable runtime safety-constraint framework tailored to LLM agents. A lightweight domain-specific language (DSL) uniformly models trigger conditions, predicate evaluation, and enforcement mechanisms. AgentSpec supports LLM-assisted rule generation, cross-domain generalization, and low-overhead (millisecond-level) constraint enforcement across diverse domains, including code sandboxing, embodied robotics, and autonomous driving. Experiments show that AgentSpec intercepts over 90% of unsafe executions for code agents, eliminates all hazardous actions in embodied tasks, achieves 100% compliance with traffic regulations in autonomous driving, and attains LLM-generated rule precision of 95.56% and recall of 70.96%.

📝 Abstract
Agents built on LLMs are increasingly deployed across diverse domains, automating complex decision-making and task execution. However, their autonomy introduces safety risks, including security vulnerabilities, legal violations, and unintended harmful actions. Existing mitigation methods, such as model-based safeguards and early enforcement strategies, fall short in robustness, interpretability, and adaptability. To address these challenges, we propose AgentSpec, a lightweight domain-specific language for specifying and enforcing runtime constraints on LLM agents. With AgentSpec, users define structured rules that incorporate triggers, predicates, and enforcement mechanisms, ensuring agents operate within predefined safety boundaries. We implement AgentSpec across multiple domains, including code execution, embodied agents, and autonomous driving, demonstrating its adaptability and effectiveness. Our evaluation shows that AgentSpec successfully prevents unsafe executions in over 90% of code agent cases, eliminates all hazardous actions in embodied agent tasks, and enforces 100% compliance by autonomous vehicles (AVs). Despite its strong safety guarantees, AgentSpec remains computationally lightweight, with overheads in milliseconds. By combining interpretability, modularity, and efficiency, AgentSpec provides a practical and scalable solution for enforcing LLM agent safety across diverse applications. We also automate the generation of rules using LLMs and assess their effectiveness. Our evaluation shows that the rules generated by OpenAI o1 achieve a precision of 95.56% and recall of 70.96% for embodied agents, successfully identifying 87.26% of the risky code, and prevent AVs from breaking laws in 5 out of 8 scenarios.
Problem

Research questions and friction points this paper is trying to address.

Addressing safety risks in autonomous LLM agents
Overcoming limitations of existing enforcement methods
Ensuring compliance across diverse application domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight DSL for runtime LLM agent constraints
Structured rules with triggers and enforcement mechanisms
Automated rule generation using LLMs for safety
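The rule structure described above (triggers, predicates, and enforcement mechanisms) can be sketched in plain Python. This is an illustrative sketch only: the `Rule` class, field names, and the `before_tool_call` event are hypothetical stand-ins, not AgentSpec's actual DSL syntax.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rule:
    """Hypothetical safety rule: a trigger event, a predicate, and an enforcement action."""
    trigger: str                           # event name that activates the rule
    predicate: Callable[[dict], bool]      # condition evaluated on the event context
    enforce: Callable[[dict], dict]        # action applied when the predicate holds

def apply_rules(rules: List[Rule], event: str, ctx: dict) -> dict:
    """Run every rule whose trigger matches the event and whose predicate holds."""
    for rule in rules:
        if rule.trigger == event and rule.predicate(ctx):
            ctx = rule.enforce(ctx)
    return ctx

# Example rule: block shell commands that delete files before the tool call runs.
no_rm = Rule(
    trigger="before_tool_call",
    predicate=lambda ctx: ctx["tool"] == "shell" and "rm " in ctx["args"],
    enforce=lambda ctx: {**ctx, "blocked": True},
)

result = apply_rules([no_rm], "before_tool_call",
                     {"tool": "shell", "args": "rm -rf /tmp/x"})
```

Keeping the predicate and enforcement as explicit, named components is what makes such rules declarative and interpretable: each intercepted action can be traced back to the specific rule that fired.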
Haoyu Wang
School of Computing and Information Systems, Singapore Management University, Singapore
Christopher M. Poskitt
Singapore Management University (SMU)
software engineering, software testing, formal methods, cybersecurity, graph transformation
Jun Sun
School of Computing and Information Systems, Singapore Management University, Singapore