🤖 AI Summary
This paper addresses the limitation of conventional line-based diff tools in accurately capturing structural changes in Solidity smart contracts. To this end, it proposes the first abstract syntax tree (AST)-based structured differencing method tailored for Solidity. The approach introduces a Solidity-specific AST differencing algorithm that integrates syntactic node normalization, structure-aware matching, and optimized tree edit distance computation to generate sound edit scripts. Evaluated on 353,262 contract pairs, the method achieves a 96.1% diffing success rate, surpassing the state of the art, and produces significantly shorter edit scripts. On 925 real-world developer commits, it also outperforms Git's line-based diff. The core contribution is a structurally precise AST-level differencing tool for Solidity smart contracts, which supports contract evolution analysis and downstream tasks such as vulnerability detection and automated repair.
📝 Abstract
Structured code differencing compares the hierarchical structure of code via its abstract syntax tree (AST) to capture modifications. AST-based source code differencing enables tasks such as vulnerability detection and automated repair, where traditional line-based differencing falls short. We introduce SoliDiffy, the first AST differencing tool for Solidity smart contracts. It generates an edit script that soundly describes the structural differences between two smart contracts using insert, delete, update, and move operations. In our evaluation on 353,262 contract pairs, SoliDiffy achieved a 96.1% diffing success rate, surpassing the state of the art, and produced significantly shorter edit scripts. Additional experiments on 925 real-world commits further confirmed its superiority over Git line-based differencing. SoliDiffy provides accurate representations of smart contract evolution even in the presence of multiple complex modifications to the source code. SoliDiffy is publicly available at https://github.com/mojtaba-eshghie/SoliDiffy.
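
To make the notion of an edit script concrete, the following is a minimal Python sketch of a tree diff over a toy AST. It is not SoliDiffy's algorithm: the `Node` and `Edit` types, the `naive_diff` function, and the node kinds such as `FunctionDefinition` are hypothetical names chosen for illustration. The sketch pairs children purely by position and therefore cannot detect move operations, which a real AST differ must handle.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # hypothetical node kind, e.g. "FunctionDefinition"
    value: str = ""                # identifier or literal text attached to the node
    children: List["Node"] = field(default_factory=list)


@dataclass(frozen=True)
class Edit:
    op: str      # "insert", "delete", or "update" ("move" is omitted in this sketch)
    path: str    # child indices from the root, e.g. "0.2"
    detail: str  # human-readable description of the change


def naive_diff(old: Node, new: Node, path: str = "0") -> List[Edit]:
    """Compare two trees top-down, pairing children by position only."""
    if old.kind != new.kind:
        # Different node kinds: treat the whole subtree as replaced.
        return [Edit("delete", path, old.kind), Edit("insert", path, new.kind)]

    edits: List[Edit] = []
    if old.value != new.value:
        edits.append(Edit("update", path, f"{old.value} -> {new.value}"))

    # Recurse into the children that exist on both sides.
    common = min(len(old.children), len(new.children))
    for i in range(common):
        edits += naive_diff(old.children[i], new.children[i], f"{path}.{i}")

    # Surplus children on the old side were deleted; on the new side, inserted.
    for i in range(common, len(old.children)):
        edits.append(Edit("delete", f"{path}.{i}", old.children[i].kind))
    for i in range(common, len(new.children)):
        edits.append(Edit("insert", f"{path}.{i}", new.children[i].kind))
    return edits


# Toy trees for two versions of a function (assumed shapes, not real parser output).
old_tree = Node("FunctionDefinition", "withdraw", [
    Node("Parameter", "amount"),
    Node("RequireStatement", "amount > 0"),
])
new_tree = Node("FunctionDefinition", "withdraw", [
    Node("Parameter", "value"),
    Node("RequireStatement", "amount > 0"),
    Node("EmitStatement", "Withdrawn"),
])

script = naive_diff(old_tree, new_tree)
for edit in script:
    print(edit.op, edit.path, edit.detail)
```

Even in this simplified form, the script names the affected syntactic units (a renamed parameter, an added statement) rather than changed text lines, which is the property that AST-level differencing offers over a line-based diff.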