🤖 AI Summary
Frequent Solidity version updates cause widespread cross-version compilation errors, which hinder smart contract migration and maintenance. To address this, we propose an automated repair framework that integrates large language models (LLMs) with domain-specific knowledge. The approach combines context-aware code slicing, knowledge retrieval driven by the official Ethereum documentation, and iterative prompt engineering to overcome the limitations of general-purpose LLMs in semantic-level error correction. The framework implements a multi-stage, domain-customized repair pipeline that improves version compatibility. Evaluated on a real-world smart contract dataset, it achieves 96.97% repair accuracy, outperforming the GPT-4o baseline by 24.24%. This work provides an efficient, reliable technical foundation for evolving smart contracts across Solidity versions.
📝 Abstract
Solidity, the dominant smart contract language for Ethereum, has evolved rapidly, with frequent version updates to enhance security, functionality, and developer experience. However, these continual changes introduce significant challenges, particularly compilation errors, code migration, and maintenance. We therefore conduct an empirical study of the challenges in Solidity version evolution and find that 81.68% of the examined contracts encounter errors when compiled across different versions, with 86.92% of these being compilation errors.
To mitigate these challenges, we systematically evaluate large language models (LLMs) for resolving Solidity compilation errors during version migrations. Our empirical analysis of both open-source (LLaMA3, DeepSeek) and closed-source (GPT-4o, GPT-3.5-turbo) LLMs reveals that although these models can repair errors, their effectiveness diminishes significantly for semantic-level issues and depends strongly on the prompt engineering strategy. This underscores the need for domain-specific adaptation when developing reliable LLM-based repair systems for smart contracts.
Building on these insights, we introduce SMCFIXER, a novel framework that integrates expert knowledge retrieval with LLM-based repair for resolving Solidity compilation errors. The architecture comprises three core phases: (1) context-aware code slicing that extracts the code relevant to each error; (2) expert knowledge retrieval from official documentation; and (3) iterative patch generation for Solidity migration. In experiments on real-world Solidity version migrations, SMCFIXER reaches 96.97% accuracy, a statistically significant 24.24% improvement over the GPT-4o baseline.
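The three phases can be illustrated with a minimal Python sketch. This is not the SMCFIXER implementation: every name here (`CompileError`, `KNOWLEDGE_BASE`, `slice_context`, `retrieve_knowledge`, `repair`, `toy_compile`, `toy_patch`) is hypothetical, the compiler and the LLM are replaced by toy stand-ins, and the knowledge base holds a single hand-written note. The example relies on one real breaking change, the removal of the global `now` in Solidity 0.7.0 in favor of `block.timestamp`.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CompileError:
    line: int      # 1-based line number of the offending statement
    message: str   # diagnostic text


# Assumption: a hand-written stand-in for notes that a real system would
# retrieve from the official Solidity documentation.
KNOWLEDGE_BASE = {
    "now": "Solidity 0.7.0 removed the global `now`; use `block.timestamp`.",
}


def slice_context(source: str, error: CompileError, radius: int = 2) -> str:
    """Phase 1: keep only the lines surrounding the reported error."""
    lines = source.splitlines()
    start = max(0, error.line - 1 - radius)
    end = min(len(lines), error.line + radius)
    return "\n".join(lines[start:end])


def retrieve_knowledge(error: CompileError) -> List[str]:
    """Phase 2: return notes whose key occurs as a word in the message."""
    return [note for key, note in KNOWLEDGE_BASE.items()
            if re.search(rf"\b{re.escape(key)}\b", error.message)]


def repair(source: str,
           compile_fn: Callable[[str], Optional[CompileError]],
           patch_fn: Callable[[str, CompileError, str, List[str]], str],
           max_rounds: int = 3) -> Optional[str]:
    """Phase 3: loop compile -> slice -> retrieve -> patch until clean."""
    for _ in range(max_rounds):
        error = compile_fn(source)
        if error is None:
            return source
        context = slice_context(source, error)
        notes = retrieve_knowledge(error)
        source = patch_fn(source, error, context, notes)
    # Give up if the last patch still does not compile.
    return source if compile_fn(source) is None else None


def toy_compile(source: str) -> Optional[CompileError]:
    """Toy compiler: reports the first line that still uses `now`."""
    for number, text in enumerate(source.splitlines(), start=1):
        if re.search(r"\bnow\b", text):
            return CompileError(number, '"now" is not available; see block.timestamp')
    return None


def toy_patch(source: str, error: CompileError,
              context: str, notes: List[str]) -> str:
    """Toy patcher standing in for an LLM call; it ignores `context`."""
    if not notes:
        return source
    lines = source.splitlines()
    lines[error.line - 1] = re.sub(r"\bnow\b", "block.timestamp",
                                   lines[error.line - 1])
    return "\n".join(lines)


LEGACY = """pragma solidity ^0.7.0;
contract Clock {
    function stamp() public view returns (uint256) {
        return now;
    }
}"""

fixed = repair(LEGACY, toy_compile, toy_patch)
print(fixed)
```

In a real pipeline, `compile_fn` would wrap an actual Solidity compiler and `patch_fn` would build a prompt from the sliced context and retrieved notes. The bounded loop reflects the iterative nature of phase 3: each round feeds the remaining diagnostic back in until the contract compiles or the round budget is exhausted.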