BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
The study evaluates AI agents' offensive and defensive cybersecurity capabilities through the BountyBench framework: 25 real-world systems are set up, three task types (detecting, exploiting, and patching vulnerabilities) are defined, and the performance of multiple AI agents is assessed.

📝 Abstract
AI agents have the potential to significantly alter the cybersecurity landscape. To help us understand this change, we introduce the first framework to capture offensive and defensive cyber-capabilities in evolving real-world systems. Instantiating this framework with BountyBench, we set up 25 systems with complex, real-world codebases. To capture the vulnerability lifecycle, we define three task types: Detect (detecting a new vulnerability), Exploit (exploiting a specific vulnerability), and Patch (patching a specific vulnerability). For Detect, we construct a new success indicator, which is general across vulnerability types and provides localized evaluation. We manually set up the environment for each system, including installing packages, setting up server(s), and hydrating database(s). We add 40 bug bounties, which are vulnerabilities with monetary awards from $10 to $30,485, and cover 9 of the OWASP Top 10 Risks. To modulate task difficulty, we devise a new strategy based on information to guide detection, interpolating from identifying a zero day to exploiting a specific vulnerability. We evaluate 5 agents: Claude Code, OpenAI Codex CLI, and custom agents with GPT-4.1, Gemini 2.5 Pro Preview, and Claude 3.7 Sonnet Thinking. Given up to three attempts, the top-performing agents are Claude Code (5% on Detect, mapping to $1,350), Custom Agent with Claude 3.7 Sonnet Thinking (5% on Detect, mapping to $1,025; 67.5% on Exploit), and OpenAI Codex CLI (5% on Detect, mapping to $2,400; 90% on Patch, mapping to $14,422). OpenAI Codex CLI and Claude Code are more capable at defense, achieving higher Patch scores of 90% and 87.5%, compared to Exploit scores of 32.5% and 57.5% respectively; in contrast, the custom agents are relatively balanced between offense and defense, achieving Exploit scores of 40-67.5% and Patch scores of 45-60%.
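The vulnerability-lifecycle setup described in the abstract (three task types, a dollar award per bounty, and an information-based difficulty knob interpolating from a zero-day search to a fully specified vulnerability) can be sketched as a minimal data model. The names `BountyTask`, `hint_level`, and `dollar_impact` below are illustrative assumptions, not the paper's actual API:

```python
from dataclasses import dataclass
from enum import Enum

class TaskType(Enum):
    DETECT = "detect"    # find a new vulnerability in the system
    EXPLOIT = "exploit"  # exploit a specific, known vulnerability
    PATCH = "patch"      # patch a specific, known vulnerability

@dataclass
class BountyTask:
    system: str          # one of the 25 real-world codebases
    task_type: TaskType
    award_usd: float     # monetary award attached to the bug bounty
    hint_level: int      # 0 = zero-day search; higher = more guiding information

def dollar_impact(solved: list[bool], awards: list[float]) -> float:
    """Sum the awards of bounties solved within the allowed attempts."""
    return sum(a for ok, a in zip(solved, awards) if ok)

# e.g., an agent that solves the first and third of three bounties
total = dollar_impact([True, False, True], [1350.0, 500.0, 1050.0])  # 2400.0
```

This mirrors how the headline numbers are reported, e.g. "5% on Detect, mapping to $2,400": the score is a solve rate over tasks, and the dollar figure is the sum of awards for the solved bounties.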
Problem

Research questions and friction points this paper is trying to address.

Measuring AI agents' impact on real-world cybersecurity systems
Evaluating offensive and defensive capabilities in vulnerability lifecycle
Assessing AI performance in detecting, exploiting, and patching vulnerabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework for offensive and defensive cyber-capabilities evaluation
BountyBench with 25 real-world systems and 40 bounties
New strategy modulating task difficulty via information guidance
Authors

Andy K. Zhang, Stanford University
Joey Ji, Stanford University
Celeste Menders, Stanford University
Riya Dulepet, Stanford University
Thomas Qin, Stanford University
Ron Y. Wang, Stanford University
Junrong Wu, Stanford University
Kyleen Liao, Stanford University
Jiliang Li, Stanford University
Jinghan Hu, Stanford University
Sara Hong, Stanford University
Nardos Demilew, Stanford University
Shivatmica Murgai, Stanford University
Jason Tran, Stanford University
Nishka Kacheria, Stanford University
Ethan Ho, Stanford University
Denis Liu, Stanford University
Lauren McLane, Stanford University
Olivia Bruvik, Stanford University
Dai-Rong Han, Stanford University
Seungwoo Kim, Stanford University
Akhil Vyas, Stanford University
Cuiyuanxiu Chen, Stanford University
Ryan Li, Stanford University
Weiran Xu, Beijing University of Posts and Telecommunications (natural language processing)
Jonathan Z. Ye, Stanford University
Prerit Choudhary, Stanford University
Siddharth M. Bhatia, Stanford University
Vikram Sivashankar, Stanford University
Yuxuan Bao, Stanford University
Dawn Song, UC Berkeley (computer security and privacy)
Dan Boneh, Stanford University (cryptography, computer security, computer science theory)
Daniel E. Ho, Stanford University (regulatory policy, artificial intelligence, administrative law, antidiscrimination)
Percy Liang, Stanford University (machine learning, natural language processing)