🤖 AI Summary
This work addresses the challenge that existing large language models often fail to generate syntactically and functionally correct register-transfer level (RTL) code for complex digital circuits. To overcome this limitation without fine-tuning pretrained models, the authors propose MeltRTL, a framework that combines a dynamically routed mixture-of-experts attention mechanism with a non-linear probe-based intervention strategy applied during inference. This approach enables efficient, targeted modifications to specific attention heads, significantly improving generation quality. Evaluated on the VerilogEval benchmark, MeltRTL achieves a 96% synthesizability rate and 60% functional correctness, improvements of 10.7 and 14.7 percentage points over the baseline, respectively, while incurring only a 27% increase in computational overhead.
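The summary does not spell out how specifications are routed to experts. A common way to realize such dynamic routing is a softmax gating network that scores each expert against an embedding of the design specification and dispatches to the top-1 expert; the sketch below assumes that setup, and the gate weights, embedding, and expert categories are all hypothetical, not taken from the paper.

```python
import math

def softmax(scores):
    # Numerically stable softmax over raw gate scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_to_expert(spec_embedding, gate_weights):
    """Linear gate over a (toy) specification embedding; returns the
    index of the top-1 expert plus the full routing distribution."""
    scores = [sum(w * x for w, x in zip(row, spec_embedding))
              for row in gate_weights]
    probs = softmax(scores)
    return max(range(len(probs)), key=lambda i: probs[i]), probs

# Three hypothetical experts, e.g. combinational / sequential / FSM designs.
gate = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
expert, probs = route_to_expert([1.0, 0.0], gate)
print(expert)  # → 0 (the first expert wins the gate)
```

In a real model the gate would consume the LLM's hidden representation of the specification rather than a hand-built embedding, but the top-1 dispatch logic is the same.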
📝 Abstract
The automated generation of hardware register-transfer level (RTL) code with large language models (LLMs) shows promise, yet current solutions struggle to produce syntactically and functionally correct code for complex digital designs. This paper introduces MeltRTL, a novel framework that integrates multi-expert attention with inference-time intervention (ITI) to significantly improve LLM-based RTL code generation accuracy without retraining the base model. MeltRTL introduces three key innovations: (1) a multi-expert attention architecture that dynamically routes design specifications to specialized expert networks, enabling targeted reasoning across diverse hardware categories; (2) an inference-time intervention mechanism that employs non-linear probes to detect and correct hardware-specific inaccuracies during generation; and (3) an efficient intervention framework that selectively operates on expert-specific attention heads with minimal computational overhead. We evaluate MeltRTL on the VerilogEval benchmark, achieving 96% synthesizability and 60% functional correctness, compared to the base LLM's 85.3% and 45.3%, respectively. These improvements are obtained entirely at inference time, with only a 27% increase in computational overhead and no model fine-tuning, making MeltRTL immediately deployable on existing pre-trained LLMs. Ablation studies further show the complementary benefits of the multi-expert architecture and ITI, highlighting their synergistic effect when combined.
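The abstract's second innovation, a non-linear probe that detects and corrects errors at specific attention heads, can be sketched as a tiny MLP classifier over a head's activation, followed by a shift along a learned correction direction when the probe fires. This is a minimal illustration of that pattern, not the paper's implementation: the probe architecture, the correction direction, and all parameter values below are assumptions.

```python
import math

def nonlinear_probe(h, W1, b1, w2, b2):
    """Tiny MLP probe (one ReLU hidden layer, sigmoid output): returns
    the probability that this attention-head activation encodes a
    hardware-specific inaccuracy. All parameters are illustrative."""
    z = [max(sum(w * x for w, x in zip(row, h)) + b, 0.0)
         for row, b in zip(W1, b1)]
    logit = sum(w * zj for w, zj in zip(w2, z)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def intervene(h, probe_params, direction, alpha=2.0, threshold=0.5):
    """If the probe flags the activation, shift it along a learned
    correction direction (unit-normalised, scaled by alpha); otherwise
    pass it through unchanged."""
    if nonlinear_probe(h, *probe_params) <= threshold:
        return list(h)
    norm = math.sqrt(sum(d * d for d in direction))
    return [x + alpha * d / norm for x, d in zip(h, direction)]

# Toy 3-dim activation: this hand-picked probe fires when h[0] > 0.
params = ([[1.0, 0.0, 0.0]], [0.0], [4.0], 0.0)
direction = [0.0, 1.0, 0.0]
print(intervene([1.0, 0.0, 0.0], params, direction))   # → [1.0, 2.0, 0.0]
print(intervene([-1.0, 0.0, 0.0], params, direction))  # → [-1.0, 0.0, 0.0]
```

Restricting this check-and-shift step to a selected subset of expert-specific heads, rather than every head in the model, is what keeps the added inference cost small (the reported 27% overhead).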