An end-to-end agentic pipeline for smart contract translation and quality evaluation

📅 2026-02-14

🤖 AI Summary

This work addresses the lack of systematic evaluation of large language models (LLMs) that generate smart contracts from natural language, a gap that hinders reliable assessment of functional correctness, security, and logical consistency. To bridge this gap, we propose an end-to-end multi-agent evaluation framework built on CrewAI in which agents collaborate across three stages: natural language parsing, structured modeling, and Solidity code generation. The framework introduces a five-dimensional quality metric (functional completeness, variable fidelity, state-machine correctness, business-logic fidelity, and code quality) and attaches traceable provenance metadata to every artifact. By combining compilation validation, static security analysis, and pairwise comparison against real-world contracts, the approach reproducibly identifies logic omissions and state transition errors, providing a systematic benchmark for automated smart contract generation.
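The staged flow with traceable metadata can be illustrated with a minimal sketch. It does not use the CrewAI API and is not the authors' implementation: the `Artifact` class, the `run_pipeline` helper, the stage names, and the two placeholder stage functions are all hypothetical, and the real stages would call LLM agents rather than format strings. The sketch only shows one way each artifact could record which upstream artifact it was derived from.

```python
from dataclasses import dataclass, field
from hashlib import sha256
from typing import Callable, List, Optional, Tuple


@dataclass
class Artifact:
    """One pipeline output plus the provenance needed to trace it (hypothetical schema)."""
    stage: str
    content: str
    parent_digest: Optional[str]  # digest of the artifact this one was derived from
    digest: str = field(init=False)

    def __post_init__(self) -> None:
        # Content hash lets a later artifact point unambiguously at its source.
        self.digest = sha256(self.content.encode("utf-8")).hexdigest()


def run_pipeline(spec: str, stages: List[Tuple[str, Callable[[str], str]]]) -> List[Artifact]:
    """Run each stage on the previous stage's output, chaining provenance digests."""
    artifacts = [Artifact("specification", spec, None)]
    for name, stage_fn in stages:
        previous = artifacts[-1]
        artifacts.append(Artifact(name, stage_fn(previous.content), previous.digest))
    return artifacts


# Placeholder stages (assumptions): stand-ins for the parsing and generation agents.
def parse_spec(text: str) -> str:
    return f"schema({text})"


def generate_solidity(schema: str) -> str:
    return f"// generated from {schema}\ncontract Escrow {{}}"


trace = run_pipeline(
    "Buyer pays seller on delivery.",
    [("structured_schema", parse_spec), ("solidity_code", generate_solidity)],
)
for artifact in trace:
    print(artifact.stage, artifact.digest[:8], artifact.parent_digest and artifact.parent_digest[:8])
```

Chaining content digests is one simple design choice for provenance; a real system would likely also record the model, prompt, and refinement iteration for each artifact.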

📝 Abstract

We present an end-to-end framework for systematic evaluation of LLM-generated smart contracts from natural-language specifications. The system parses contractual text into structured schemas, generates Solidity code, and performs automated quality assessment through compilation and security checks. Using CrewAI-style agent teams with iterative refinement, the pipeline produces structured artifacts with full provenance metadata. Quality is measured across five dimensions (functional completeness, variable fidelity, state-machine correctness, business-logic fidelity, and code quality), which are aggregated into composite scores. The framework supports paired evaluation against ground-truth implementations, quantifying alignment and identifying systematic error modes such as logic omissions and state transition inconsistencies. This provides a reproducible benchmark for empirical research on smart contract synthesis quality and supports extensions to formal verification and compliance checking.
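As a rough illustration of how five dimension scores might be aggregated and compared in a paired evaluation, consider the sketch below. The abstract does not state the aggregation formula, so the weighted mean with equal default weights, the 0 to 1 score range, and the names `QualityScores`, `composite`, and `dimension_gaps` are all assumptions made for this example, and the score values are invented inputs rather than reported results.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class QualityScores:
    """Per-dimension scores, assumed here to lie in [0, 1]."""
    functional_completeness: float
    variable_fidelity: float
    state_machine_correctness: float
    business_logic_fidelity: float
    code_quality: float


def composite(scores: QualityScores, weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the five dimensions (equal weights are an assumption)."""
    names = [f.name for f in fields(scores)]
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    return sum(weights[name] * getattr(scores, name) for name in names) / total_weight


def dimension_gaps(generated: QualityScores, reference: QualityScores) -> Dict[str, float]:
    """Per-dimension shortfall of the generated contract relative to the ground truth."""
    return {
        f.name: getattr(reference, f.name) - getattr(generated, f.name)
        for f in fields(generated)
    }


# Invented example values, not results from the paper.
generated = QualityScores(0.75, 1.0, 0.5, 0.75, 1.0)
reference = QualityScores(1.0, 1.0, 1.0, 1.0, 1.0)

gaps = dimension_gaps(generated, reference)
weakest = max(gaps, key=gaps.get)
print(f"composite={composite(generated):.2f}, largest gap: {weakest}")
```

Reporting the per-dimension gap alongside the composite keeps an error mode such as a state transition inconsistency visible even when the aggregate score looks acceptable.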

Problem

Research questions and friction points this paper is trying to address.

smart contract

quality evaluation

LLM-generated code

natural-language specification

contract synthesis

Innovation

Methods, ideas, or system contributions that make the work stand out.

agentic pipeline

smart contract synthesis

LLM evaluation