🤖 AI Summary
This work addresses the limitations of existing large language models in Verilog code generation, which often rely heavily on proprietary models or external verification tools, resulting in high costs, privacy concerns, and insufficient functional correctness. To overcome these challenges, the authors propose a unified multi-agent framework that leverages a testbench-driven verification mechanism to automatically generate reasoning-oriented training data. Combined with a test-time scaling strategy, this approach enables iterative generation, verification, and debugging of RTL designs entirely without external tools. Through local fine-tuning alone, the method significantly improves functional correctness. Evaluated on the VerilogEval-v2, RTLLM-v2, and CVDP benchmarks, it surpasses the current state-of-the-art model, QiMeng-CodeV-R1, while using fewer training resources.
📝 Abstract
Large language models (LLMs) have recently emerged as a promising approach for automating Verilog code generation; however, existing methods primarily emphasize syntactic correctness and often rely on commercial models or external verification tools, which raises concerns about cost, data privacy, and limited guarantees of functional correctness. This work proposes a unified multi-agent framework for reasoning-oriented training data generation with integrated testbench-driven verification, enabling a locally fine-tuned LLM, SiliconMind-V1, to iteratively generate, test, and debug Register-Transfer Level (RTL) designs through test-time scaling. Experimental results on representative benchmarks (VerilogEval-v2, RTLLM-v2, and CVDP) demonstrate that the proposed approach outperforms the state-of-the-art QiMeng-CodeV-R1 in functional correctness while using fewer training resources.
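The generate-test-debug loop with test-time scaling described in the abstract can be sketched as follows. This is a minimal illustrative skeleton, not the paper's actual implementation: the function names (`generate_candidate`, `run_testbench`, `debug_candidate`) and the attempt/debug budgets are hypothetical stand-ins for the LLM agents and the Verilog simulator a real system would call.

```python
import random

# Hypothetical sketch of the iterative generate-verify-debug loop described
# in the abstract. All names here are illustrative stand-ins, not the paper's
# API: a real system would invoke an LLM for generation/debugging and a
# Verilog simulator for the testbench, where these toy stubs stand.

def generate_candidate(spec, seed):
    """Stand-in for LLM generation: returns an RTL candidate (toy dict)."""
    rng = random.Random(seed)
    # Toy model: some sampled candidates carry a functional bug.
    return {"code": f"module {spec}; endmodule", "bug": rng.random() < 0.7}

def run_testbench(candidate):
    """Stand-in for testbench-driven verification: pass/fail verdict."""
    return not candidate["bug"]

def debug_candidate(candidate):
    """Stand-in for the debug agent: attempts to repair a failing design."""
    fixed = dict(candidate)
    fixed["bug"] = False  # toy repair always succeeds; an LLM would retry
    return fixed

def generate_verify_debug(spec, max_attempts=4, max_debug_rounds=2):
    """Test-time scaling: sample multiple candidates, verify each against
    the testbench, and run a bounded debug loop on failures."""
    for attempt in range(max_attempts):
        candidate = generate_candidate(spec, seed=attempt)
        for _ in range(max_debug_rounds):
            if run_testbench(candidate):
                return candidate["code"]  # verified design found
            candidate = debug_candidate(candidate)
        if run_testbench(candidate):
            return candidate["code"]
    return None  # no candidate passed verification within the budget
```

Because verification happens inside the loop, correctness feedback is available at inference time without any external tool beyond the local simulator, which is the property the framework exploits both for training-data generation and for test-time scaling.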