MeltRTL: Multi-Expert LLMs with Inference-time Intervention for RTL Code Generation

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge that existing large language models often fail to generate syntactically and functionally correct register-transfer level (RTL) code for complex digital circuits. To overcome this limitation without fine-tuning pretrained models, the authors propose MeltRTL, a framework that combines a dynamically routed mixture-of-experts attention mechanism with a non-linear probe-based intervention strategy applied during inference. This approach enables efficient, targeted modifications to specific attention heads, significantly improving generation quality. Evaluated on the VerilogEval benchmark, MeltRTL achieves a 96% synthesizability rate and 60% functional correctness, improvements of 10.7 and 14.7 percentage points over the baseline, respectively, while incurring only a 27% increase in computational overhead.
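The routing idea in the summary can be illustrated with a small sketch: a gating network scores each specialist expert from an embedding of the design specification and dispatches the request to the best-scoring expert. All names here (`EXPERTS`, `gate_weights`, the expert categories) are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of dynamic expert routing: a softmax gate over
# hardware-category experts, driven by a spec embedding. Parameters are
# random stand-ins; in practice the gate would be trained.
import numpy as np

rng = np.random.default_rng(0)

EXPERTS = ["combinational", "sequential", "fsm", "arithmetic"]  # assumed categories
DIM = 16  # illustrative embedding size

gate_weights = rng.normal(size=(len(EXPERTS), DIM))  # one score row per expert

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def route(spec_embedding):
    """Return (chosen expert, gate probabilities) for a spec embedding."""
    probs = softmax(gate_weights @ spec_embedding)
    return EXPERTS[int(np.argmax(probs))], probs

expert, probs = route(rng.normal(size=DIM))
print(expert, probs.round(3))
```

In a full system the chosen expert would be a specialized attention pathway rather than a label; the gate shown here only captures the dispatch step.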

📝 Abstract
The automated generation of hardware register-transfer level (RTL) code with large language models (LLMs) shows promise, yet current solutions struggle to produce syntactically and functionally correct code for complex digital designs. This paper introduces MeltRTL, a novel framework that integrates multi-expert attention with inference-time intervention (ITI) to significantly improve LLM-based RTL code generation accuracy without retraining the base model. MeltRTL introduces three key innovations: (1) A multi-expert attention architecture that dynamically routes design specifications to specialized expert networks, enabling targeted reasoning across various hardware categories; (2) An inference-time intervention mechanism that employs non-linear probes to detect and correct hardware-specific inaccuracies during generation; and (3) An efficient intervention framework that selectively operates on expert-specific attention heads with minimal computational overhead. We evaluate MeltRTL on the VerilogEval benchmark, achieving 96% synthesizability and 60% functional correctness, compared to the base LLM's 85.3% and 45.3%, respectively. These improvements are obtained entirely at inference time, with only 27% computational overhead and no model fine-tuning, making MeltRTL immediately deployable on existing pre-trained LLMs. Ablation studies further show the complementary benefits of multi-expert architecture and ITI, highlighting their synergistic effects when combined.
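The abstract's second innovation, probe-based inference-time intervention, can be sketched as follows: a small non-linear probe scores an attention-head activation, and when the probe flags a likely hardware-specific error, the activation is shifted along a learned correction direction. The shapes, the two-layer tanh probe, and the names (`probe`, `intervene`, `correction`) are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of non-linear probe-based inference-time intervention.
# Probe parameters are random stand-ins; in practice they would be trained
# on activations labeled as correct/incorrect.
import numpy as np

rng = np.random.default_rng(1)
D = 8  # per-head activation dimension (illustrative)

W1, b1 = rng.normal(size=(D, D)), np.zeros(D)   # probe hidden layer
w2, b2 = rng.normal(size=D), 0.0                # probe output layer
correction = rng.normal(size=D)                 # learned correction direction
correction /= np.linalg.norm(correction)

def probe(h):
    """Non-linear probe: estimated probability that h encodes an error."""
    hidden = np.tanh(h @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

def intervene(h, alpha=2.0, threshold=0.5):
    """Shift the activation along the correction direction if the probe fires."""
    if probe(h) > threshold:
        return h + alpha * correction
    return h

h = rng.normal(size=D)
h_adjusted = intervene(h)
```

Selectively applying `intervene` only to the heads where probes are reliable is what keeps the overhead low: untouched heads cost nothing extra at inference time.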
Problem

Research questions and friction points this paper is trying to address.

RTL code generation
large language models
functional correctness
syntactic correctness
hardware design
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-expert Attention
Inference-time Intervention
RTL Code Generation
Hardware-aware LLMs
Verilog Synthesizability