AI Summary
This work addresses the inefficiency of deploying financial derivative contracts on blockchains, the error-proneness of manual Solidity coding, and high gas costs. We propose a two-stage curriculum-learning-based reinforcement learning framework that enables end-to-end generation of functionally correct and gas-optimized Solidity smart contracts from domain-specific Common Domain Model (CDM) specifications. In the first stage, the model is trained for functional correctness; in the second, it is fine-tuned using gas consumption as the reward signal. Proximal Policy Optimization (PPO) drives sequence generation, while a curated library of safety-critical code snippets enhances reliability. On unseen test cases, our generated contracts reduce average gas cost by 35.59% compared to baselines, while formal verification and multi-scenario testing ensure functional correctness. To our knowledge, this is the first approach integrating curriculum learning with gas-aware reinforcement learning, significantly improving both the practicality and economic viability of automated generation for complex financial smart contracts.
Abstract
Smart contract-based automation of financial derivatives offers substantial efficiency gains, but its real-world adoption is constrained by the complexity of translating financial specifications into gas-efficient executable code. In particular, generating code that is both functionally correct and economically viable from high-level specifications, such as the Common Domain Model (CDM), remains a significant challenge. This paper introduces a Reinforcement Learning (RL) framework to generate functional and gas-optimized Solidity smart contracts directly from CDM specifications. We employ a Proximal Policy Optimization (PPO) agent that learns to select optimal code snippets from a pre-defined library. To manage the complex search space, a two-phase curriculum first trains the agent for functional correctness before shifting its focus to gas optimization. Our empirical results show the RL agent learns to generate contracts with significant gas savings, achieving cost reductions of up to 35.59% on unseen test data compared to unoptimized baselines. This work presents a viable methodology for the automated synthesis of reliable and economically sustainable smart contracts, bridging the gap between high-level financial agreements and efficient on-chain execution.
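The two-phase curriculum reward described above can be sketched in a few lines. This is a minimal illustrative Python sketch, not the authors' implementation: the function names (`run_functional_tests`, `estimate_gas`), the stub values, and the exact reward shaping are assumptions chosen only to show how the reward signal could shift from correctness (phase 1) to gas savings (phase 2).

```python
# Hypothetical sketch of the two-phase curriculum reward for the PPO agent.
# The test harness and gas estimator are stubbed for illustration only.

GAS_BASELINE = 100_000  # assumed gas cost of an unoptimized baseline contract


def run_functional_tests(contract_source: str) -> tuple[int, int]:
    # Stub: the real framework would deploy the contract and run
    # multi-scenario tests; here every contract passes all 4 scenarios.
    return 4, 4


def estimate_gas(contract_source: str) -> int:
    # Stub: a real implementation would measure deployment/execution gas.
    return 64_410  # e.g. ~35.59% below the assumed baseline


def curriculum_reward(contract_source: str, phase: int) -> float:
    """Reward signal under the two-phase curriculum (illustrative)."""
    passed, total = run_functional_tests(contract_source)
    correctness = passed / total
    if phase == 1:
        return correctness               # Phase 1: correctness only
    if correctness < 1.0:
        return correctness - 1.0         # Phase 2: penalize broken contracts
    gas = estimate_gas(contract_source)
    # Phase 2: fully correct contracts earn a bonus proportional to gas savings.
    return 1.0 + (GAS_BASELINE - gas) / GAS_BASELINE
```

Under this shaping, a contract that fails any test can never outscore a correct one, so phase 2 optimizes gas only within the space of functionally correct programs.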