🤖 AI Summary
This work addresses functional correctness and security vulnerabilities in smart contract generation, where existing large language models fall short of the reliability the domain requires. We propose SolAgent, a tool-augmented multi-agent framework that emulates human expert workflows through a dual-loop refinement mechanism: an inner loop uses the Forge compiler to ensure functional correctness, and an outer loop uses the Slither static analyzer to eliminate security vulnerabilities. File system capabilities let the agent resolve complex project dependencies. On the SolEval+ benchmark, SolAgent achieves a Pass@1 rate of up to 64.39%, significantly outperforming state-of-the-art LLMs (around 25%), AI IDEs such as GitHub Copilot, and existing agent frameworks, and it reduces security vulnerabilities by up to 39.77% compared to human-written baselines. The high-quality generation trajectories it produces can also be used to distill smaller open-source models.
📝 Abstract
Smart contracts are the backbone of the decentralized web, yet ensuring their functional correctness and security remains a critical challenge. While Large Language Models (LLMs) have shown promise in code generation, they often struggle with the rigorous requirements of smart contracts, frequently producing code that is buggy or vulnerable. To address this, we propose SolAgent, a novel tool-augmented multi-agent framework that mimics the workflow of human experts. SolAgent integrates a dual-loop refinement mechanism: an inner loop using the Forge compiler to ensure functional correctness, and an outer loop leveraging the Slither static analyzer to eliminate security vulnerabilities. Additionally, the agent is equipped with file system capabilities to resolve complex project dependencies. Experiments on the SolEval+ Benchmark, a rigorous suite derived from high-quality real-world projects, demonstrate that SolAgent achieves a Pass@1 rate of up to 64.39%, significantly outperforming state-of-the-art LLMs (~25%), AI IDEs (e.g., GitHub Copilot), and existing agent frameworks. Moreover, it reduces security vulnerabilities by up to 39.77% compared to human-written baselines. Finally, we demonstrate that the high-quality trajectories generated by SolAgent can be used to distill smaller, open-source models, democratizing access to secure smart contract generation. We release our data and code at https://github.com/openpaperz/SolAgent.
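The dual-loop refinement described in the abstract can be pictured with a minimal Python sketch. It is not SolAgent's implementation: the names `Report`, `dual_loop_refine`, `generate`, `check_functional`, and `check_security`, the iteration budgets, and the way feedback is accumulated are all assumptions made for illustration. In a real setup the two checkers might wrap Forge (for example `forge build` or `forge test`) and Slither, and `generate` would call an LLM; here they are plain callables so the control flow can be read on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Report:
    """Result of one tool run (hypothetical structure, not SolAgent's)."""
    ok: bool
    messages: List[str] = field(default_factory=list)


# Hypothetical interfaces: the generator sees the task plus all feedback so far.
Generate = Callable[[str, List[str]], str]
Check = Callable[[str], Report]


def dual_loop_refine(
    task: str,
    generate: Generate,
    check_functional: Check,   # assumed to wrap a compile/test step such as Forge
    check_security: Check,     # assumed to wrap a static analyzer such as Slither
    max_outer: int = 3,        # assumed budget, not taken from the paper
    max_inner: int = 3,        # assumed budget, not taken from the paper
) -> Tuple[str, bool]:
    """Return (source, passed_both_checks)."""
    feedback: List[str] = []
    source = generate(task, feedback)

    for _ in range(max_outer):
        # Inner loop: regenerate until the functional check passes.
        for _ in range(max_inner):
            functional = check_functional(source)
            if functional.ok:
                break
            feedback = feedback + functional.messages
            source = generate(task, feedback)
        else:
            # Inner budget exhausted without a functionally correct candidate.
            return source, False

        # Outer loop: run the security check on the functional candidate.
        security = check_security(source)
        if security.ok:
            return source, True
        feedback = feedback + security.messages
        source = generate(task, feedback)

    # Outer budget exhausted; the latest candidate has not passed both checks.
    return source, False
```

Nesting the functional check inside the security loop means every security-driven rewrite is checked for functional correctness again before it is re-analyzed, which matches the inner/outer ordering stated in the abstract.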