🤖 AI Summary
Open-source large language models (LLMs) perform poorly at register-transfer level (RTL) code generation, particularly because they lack automated error correction and have limited ability to handle increasingly complex designs. To address this, the paper proposes a multi-agent collaborative framework that integrates specialized LLM agents with hardware simulation and synthesis tools (e.g., Verilator, Yosys), forming an end-to-end closed loop that covers RTL generation, compilation, functional verification, and synthesizability checking. Its core is the progressive error feedback system of agents (PEFA), an iterative feedback mechanism that self-corrects the generated RTL and progressively increases the complexity of the approach. Built on an open-source agent orchestration framework, the system runs both open- and closed-source LLMs and is evaluated on two natural language-to-RTL benchmark datasets. It achieves state-of-the-art pass rates while reducing average token consumption by 32%, narrowing the performance gap between open-source and proprietary LLMs.
📝 Abstract
We present an agentic flow in which multiple agents combine specialized LLMs and hardware simulation tools to collaboratively complete the complex task of Register Transfer Level (RTL) generation without human intervention. A key feature of the proposed flow is the progressive error feedback system of agents (PEFA), a self-correcting mechanism that leverages iterative error feedback to progressively increase the complexity of the approach. The generated RTL is checked for compilation, functional correctness, and synthesizable constructs. To validate this adaptive approach to code generation, we benchmark it on two open-source natural language-to-RTL datasets. We demonstrate the benefits of the proposed approach, implemented on an open-source agentic framework, using both open- and closed-source LLMs, effectively bridging the performance gap between them. Compared to previously published methods, our approach sets a new benchmark, providing state-of-the-art pass rates while remaining efficient in token counts.
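The abstract does not give implementation details, so the following is only a minimal sketch of what a progressive error-feedback loop of this kind might look like, not the authors' method. All names (`run_feedback_loop`, `Result`, `weak_agent`, `strong_agent`) and the retry policy are hypothetical, and the three checks are toy string tests standing in for real compilation, simulation, and synthesis tools.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A check returns None on success, or an error message on failure.
Check = Callable[[str], Optional[str]]
# A generator maps (spec, accumulated error feedback) to RTL text.
Generator = Callable[[str, List[str]], str]


@dataclass
class Result:
    rtl: str
    passed: bool
    stage: int      # index of the stage that produced `rtl`
    attempts: int   # total generation attempts used


def first_failure(rtl: str, checks: List[Tuple[str, Check]]) -> Optional[str]:
    """Run checks in order and return the first error, tagged by check name."""
    for name, check in checks:
        error = check(rtl)
        if error is not None:
            return f"{name}: {error}"
    return None


def run_feedback_loop(spec: str, stages: List[Generator],
                      checks: List[Tuple[str, Check]],
                      retries_per_stage: int = 2) -> Result:
    """Try cheap stages first; escalate only after repeated failures."""
    feedback: List[str] = []
    attempts = 0
    rtl = ""
    for stage_idx, generate in enumerate(stages):
        for _ in range(retries_per_stage):
            attempts += 1
            rtl = generate(spec, feedback)
            error = first_failure(rtl, checks)
            if error is None:
                return Result(rtl, True, stage_idx, attempts)
            feedback.append(error)  # errors carry over to later attempts
    return Result(rtl, False, len(stages) - 1, attempts)


# Toy stand-ins (assumptions): real checks would invoke external tools.
toy_checks: List[Tuple[str, Check]] = [
    ("compile", lambda r: None if "endmodule" in r else "missing endmodule"),
    ("functional", lambda r: None if "assign" in r else "output never driven"),
    ("synthesis", lambda r: "delay not synthesizable" if "#" in r else None),
]


def weak_agent(spec: str, feedback: List[str]) -> str:
    return "module xor2(input a, input b, output y);"  # always incomplete


def strong_agent(spec: str, feedback: List[str]) -> str:
    return ("module xor2(input a, input b, output y);\n"
            "  assign y = a ^ b;\nendmodule")


result = run_feedback_loop("2-input XOR", [weak_agent, strong_agent], toy_checks)
```

In this sketch the cheaper stage is retried with the accumulated error messages before the loop escalates to the next stage, which is one plausible reading of how such a flow could stay economical in tokens; the actual escalation policy is not stated in the abstract.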