🤖 AI Summary
This work addresses the challenge of automated exploit generation for smart contract vulnerabilities. We propose an AI agent framework that eliminates hand-crafted heuristics by tightly integrating large language models with six domain-specific tools—including on-chain state testing and Monte Carlo risk analysis—to enable end-to-end vulnerability discovery, exploit synthesis, and execution validation. The framework employs execution feedback to drive dynamic policy evolution. Crucially, it provides the first quantitative evidence of structural asymmetry in AI-augmented adversarial outcomes: exploitation inherently yields higher returns than defense. Evaluated on the VERITE benchmark, our approach achieves a 62.96% attack success rate, discovers nine previously unknown vulnerabilities, extracts up to $8.59M in a single transaction, accumulates $9.33M in total proceeds, and incurs as little as $0.01 per exploit attempt.
📝 Abstract
We present A1, an agentic, execution-driven system that transforms any LLM into an end-to-end exploit generator. A1 requires no hand-crafted heuristics; instead, it provides the agent with six domain-specific tools that enable autonomous vulnerability discovery. The agent can flexibly leverage these tools to understand smart contract behavior, generate exploit strategies, test them on concrete blockchain states, and refine its approach based on execution feedback. All outputs are concretely validated, eliminating false positives.
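The generate-test-refine loop described above can be sketched as follows. This is a minimal illustration, not A1's actual implementation: the `StubLLM` and `StubFork` classes, their method names, and the success criterion are hypothetical placeholders standing in for the real LLM interface and forked-chain executor.

```python
from dataclasses import dataclass


@dataclass
class Result:
    profit: float  # realized profit of the candidate exploit
    trace: str     # execution trace fed back to the agent


class StubLLM:
    """Toy stand-in for the LLM: improves each round, succeeding on attempt 3."""
    def __init__(self):
        self.calls = 0

    def propose(self, contract, feedback):
        self.calls += 1
        return f"exploit-v{self.calls}"


class StubFork:
    """Toy stand-in for execution against a forked blockchain state."""
    def execute(self, exploit):
        ok = exploit.endswith("v3")
        return Result(profit=1.0 if ok else 0.0, trace=f"revert in {exploit}")


def exploit_loop(llm, fork, contract, max_iters=5):
    """Propose, concretely validate, and refine until profitable or budget spent."""
    feedback = None
    for _ in range(max_iters):
        exploit = llm.propose(contract, feedback)  # strategy generation
        result = fork.execute(exploit)             # concrete validation on chain state
        if result.profit > 0:                      # only realized profit counts as success
            return exploit
        feedback = result.trace                    # execution feedback drives refinement
    return None


print(exploit_loop(StubLLM(), StubFork(), "0xVictim"))  # exploit-v3
```

Validating every candidate by concrete execution is what rules out false positives: an exploit "succeeds" only if it actually extracts value on the forked state.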
Evaluation across 36 real-world vulnerable contracts on Ethereum and Binance Smart Chain demonstrates a 62.96% (17 of 27) success rate on the VERITE benchmark. Beyond the VERITE dataset, A1 identified 9 additional vulnerable contracts, 5 of which emerged after the strongest model's training cutoff date. Across all 26 successful cases, A1 extracts up to 8.59 million USD per case and 9.33 million USD in total. Through 432 experiments across six LLMs, we analyze iteration-wise performance and observe diminishing returns, with average marginal gains of +9.7%, +3.7%, +5.1%, and +2.8% for iterations 2 through 5 respectively, at per-experiment costs ranging from $0.01 to $3.59. A Monte Carlo analysis of 19 historical attacks shows success probabilities of 85.9%-88.8% in the absence of detection delays.
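A Monte Carlo estimate of this kind can be sketched as below. The model structure and all parameter values here are illustrative assumptions (the 0.87 base rate is merely chosen inside the reported 85.9%-88.8% band); they are not the paper's calibration.

```python
import random


def mc_success_prob(n_trials, p_exec, delay_blocks, p_frontrun_per_block):
    """Estimate the probability that an exploit both executes successfully
    (p_exec) and survives delay_blocks of exposure without being front-run."""
    wins = 0
    for _ in range(n_trials):
        if random.random() > p_exec:
            continue  # exploit itself failed
        # exploit must survive every block of the detection delay
        if all(random.random() > p_frontrun_per_block for _ in range(delay_blocks)):
            wins += 1
    return wins / n_trials


random.seed(0)
# With zero detection delay, the estimate converges to p_exec itself.
p = mc_success_prob(100_000, p_exec=0.87, delay_blocks=0, p_frontrun_per_block=0.02)
print(round(p, 3))
```

Raising `delay_blocks` shows how success probability decays once a detection delay gives competitors or defenders time to react.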
We investigate whether attackers or defenders benefit most from deploying A1 as a continuous on-chain scanning system. Our model shows that OpenAI's o3-pro remains profitable with scanning delays of up to 30.0 days at a 0.100% vulnerability incidence rate, while faster models require rates of >=1.000% to break even. The findings expose a troubling asymmetry: at a 0.1% vulnerability rate, attackers reach on-chain scanning profitability at an exploit value of $6,000, while defenders require $60,000, raising fundamental questions about whether AI agents inevitably favor exploitation over defense.
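The asymmetry follows from a simple expected-value argument, sketched below. The cost, success-rate, and value-capture parameters are illustrative assumptions chosen to land near the reported $6,000/$60,000 break-even points (an attacker keeps the full exploit value, while a defender is assumed to capture only a bounty-like fraction, here 10%); they are not the paper's fitted parameters.

```python
def break_even_value(cost_per_scan, incidence, success_rate, capture_fraction):
    """Smallest exploit value V at which expected revenue covers scan cost:
    incidence * success_rate * capture_fraction * V >= cost_per_scan."""
    return cost_per_scan / (incidence * success_rate * capture_fraction)


cost = 3.59        # assumed: upper end of the reported per-experiment cost range
incidence = 0.001  # 0.1% vulnerability incidence rate (from the text)
success = 0.60     # assumed: roughly the reported 62.96% success rate

attacker = break_even_value(cost, incidence, success, capture_fraction=1.0)
defender = break_even_value(cost, incidence, success, capture_fraction=0.1)
print(round(attacker), round(defender))  # attacker ~ $6k, defender ~ 10x that
```

Because the defender's break-even value scales inversely with the captured fraction, any bounty structure paying a fixed fraction of exploit value reproduces a constant-factor disadvantage regardless of model cost.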