DoomArena: A framework for Testing AI Agents Against Evolving Security Threats

📅 2025-04-18

📈 Citations: 0

✨ Influential: 0

career value

234K/year

🤖 AI Summary

This paper addresses the challenge of evaluating AI agents under dynamic security threats. We propose DoomArena, the first modular security evaluation framework supporting plugin-based integration, fine-grained threat modeling, and explicit decoupling of attacks from execution environments. DoomArena is compatible with mainstream agent platforms—including BrowserGym and τ-bench—and enables customizable attack injection points, combinatorial multi-attack testing, and vulnerability–performance trade-off analysis. Its key innovation lies in the complete logical and executional decoupling of attack definitions from environmental implementations, ensuring cross-platform reusability. Empirical evaluation on state-of-the-art (SOTA) web-navigation and tool-use agents reveals three critical findings: (1) no agent achieves robustness across all threat models; (2) synergistic multi-attack compositions significantly amplify damage; and (3) defense strategies leveraging strong SOTA LLMs substantially outperform gatekeeper-style baseline models.

Technology Category

Application Category

📝 Abstract

We present DoomArena, a security evaluation framework for AI agents. DoomArena is designed on three principles: 1) It is a plug-in framework and integrates easily into realistic agentic frameworks like BrowserGym (for web agents) and $ au$-bench (for tool calling agents); 2) It is configurable and allows for detailed threat modeling, allowing configuration of specific components of the agentic framework being attackable, and specifying targets for the attacker; and 3) It is modular and decouples the development of attacks from details of the environment in which the agent is deployed, allowing for the same attacks to be applied across multiple environments. We illustrate several advantages of our framework, including the ability to adapt to new threat models and environments easily, the ability to easily combine several previously published attacks to enable comprehensive and fine-grained security testing, and the ability to analyze trade-offs between various vulnerabilities and performance. We apply DoomArena to state-of-the-art (SOTA) web and tool-calling agents and find a number of surprising results: 1) SOTA agents have varying levels of vulnerability to different threat models (malicious user vs malicious environment), and there is no Pareto dominant agent across all threat models; 2) When multiple attacks are applied to an agent, they often combine constructively; 3) Guardrail model-based defenses seem to fail, while defenses based on powerful SOTA LLMs work better. DoomArena is available at https://github.com/ServiceNow/DoomArena.

Problem

Research questions and friction points this paper is trying to address.

Testing AI agents against evolving security threats

Configurable threat modeling for agentic frameworks

Modular attack development across multiple environments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Plug-in framework for easy integration

Configurable threat modeling for detailed attacks

Modular design decouples attacks from environments

🔎 Similar Papers

No similar papers found.