DecoRTL: A Run-time Decoding Framework for RTL Code Generation with LLMs

📅 2025-07-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) frequently produce syntax errors, semantic hallucinations, and redundant or invalid outputs when generating register-transfer level (RTL) code, largely because their decoding strategies were designed for natural language and are ill-suited to hardware description languages. To address this, the paper proposes DecoRTL, a lightweight, fine-tuning-free run-time decoding framework. The method introduces three key components: (1) a token-level entropy analysis (the first such analysis for RTL generation) that pinpoints sources of uncertainty; (2) a syntax-aware, contrastive decoding mechanism combining self-consistency sampling, syntax-role-driven temperature adaptation, and multi-candidate generation with re-ranking; and (3) dynamic differentiation between syntax-critical regions, which demand determinism, and design-exploration regions, which benefit from variability. Evaluated on the VerilogEval benchmark, the framework achieves substantial improvements in syntactic validity (+18.3%), functional correctness (+15.7%), and output diversity (+22.1%), with negligible inference overhead.
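The entropy analysis described above can be illustrated with a minimal sketch. The function names (`token_entropy`, `flag_uncertain_tokens`) and the threshold value are hypothetical, not taken from the paper; the sketch only shows the standard computation of Shannon entropy over a softmax distribution and how high-entropy generation steps could be flagged as uncertain regions:

```python
import math

def token_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution over one token's logits."""
    m = max(logits)                                   # shift for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

def flag_uncertain_tokens(logit_seq, threshold=1.0):
    """Indices of generation steps whose entropy exceeds the (hypothetical) threshold."""
    return [i for i, logits in enumerate(logit_seq)
            if token_entropy(logits) > threshold]

# A peaked distribution (confident token) vs. a flat one (uncertain token):
peaked = [10.0, 0.0, 0.0, 0.0]
flat = [1.0, 1.0, 1.0, 1.0]
print(flag_uncertain_tokens([peaked, flat]))  # → [1]
```

In the paper's framing, steps flagged this way correspond to regions of structural ambiguity or semantic complexity where the model's confidence drops.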

📝 Abstract
As one of their many applications, large language models (LLMs) have recently shown promise in automating register transfer level (RTL) code generation. However, conventional LLM decoding strategies, originally designed for natural language, often fail to meet the structural and semantic demands of RTL, leading to hallucinated, repetitive, or invalid code outputs. In this paper, we first investigate the root causes of these decoding failures through an empirical analysis of token-level entropy during RTL generation. Our findings reveal that LLMs exhibit low confidence in regions of structural ambiguity or semantic complexity, showing that standard decoding strategies fail to differentiate between regions requiring determinism (syntax-critical regions) and those that benefit from creative exploratory variability (design-critical regions). To overcome this, we introduce DecoRTL, a novel run-time decoding strategy for RTL code generation that is both syntax-aware and contrastive. DecoRTL integrates two complementary components: (i) self-consistency sampling, which generates multiple candidates and re-ranks them based on token-level agreement to promote correctness while maintaining diversity; and (ii) syntax-aware temperature adaptation, which classifies tokens by their syntactic and functional roles and adjusts the sampling temperature accordingly, enforcing low temperature for syntax-critical tokens and higher temperature for exploratory ones. Our approach operates entirely at inference time without requiring any additional model fine-tuning. Through evaluations on multiple open-source LLMs using the VerilogEval benchmark, we demonstrate significant improvements in syntactic validity, functional correctness, and output diversity, while the execution overhead remains imperceptible.
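Component (ii), syntax-aware temperature adaptation, can be sketched as follows. This is not the paper's implementation: the role table `SYNTAX_CRITICAL`, the temperature values, and the choice to pick the step temperature from the greedy candidate's role are all illustrative assumptions. The sketch only shows the general mechanism of sampling near-greedily when a structural Verilog token is expected and more freely otherwise:

```python
import math
import random

# Hypothetical role table: Verilog keywords and punctuation treated as syntax-critical.
SYNTAX_CRITICAL = {"module", "endmodule", "always", "assign", "begin", "end", ";", "(", ")"}

def adaptive_sample(vocab, logits, rng, low=0.2, high=0.9):
    """Choose the step temperature from the syntactic role of the greedy candidate,
    then draw one token from the temperature-scaled softmax."""
    top = vocab[max(range(len(logits)), key=logits.__getitem__)]
    temp = low if top in SYNTAX_CRITICAL else high    # low temp => near-deterministic
    scaled = [l / temp for l in logits]
    m = max(scaled)                                   # shift for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    r, acc = rng.random() * z, 0.0
    for token, e in zip(vocab, exps):                 # inverse-CDF sampling
        acc += e
        if acc >= r:
            return token
    return vocab[-1]

rng = random.Random(0)
# "module" is the greedy candidate and syntax-critical, so sampling is near-greedy:
print(adaptive_sample(["module", "wire", "reg"], [5.0, 0.0, 0.0], rng))  # → module
```

A real decoder would apply this per step inside the generation loop, using the model's logits; the design point it illustrates is that determinism is enforced exactly where the abstract says standard decoding fails to provide it.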
Problem

Research questions and friction points this paper is trying to address.

LLMs fail to meet RTL structural and semantic demands
Standard decoding lacks differentiation for syntax vs design regions
Need for syntax-aware contrastive decoding in RTL generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-consistency sampling for diverse correct candidates
Syntax-aware temperature adaptation for token roles
Run-time decoding without model fine-tuning
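The self-consistency sampling idea listed above can be sketched as a re-ranking step over sampled candidates. The scoring scheme here (average multiset token overlap against the rest of the pool) is an assumption for illustration, not the paper's exact agreement metric:

```python
from collections import Counter

def overlap(a, b):
    """Multiset token overlap between two candidate token sequences, in [0, 1]."""
    shared = sum((Counter(a) & Counter(b)).values())
    return shared / max(len(a), len(b))

def rerank(candidates):
    """Self-consistency re-ranking: return the index of the candidate that agrees
    most, token for token, with the rest of the sample pool."""
    def score(i):
        return sum(overlap(candidates[i], candidates[j])
                   for j in range(len(candidates)) if j != i)
    return max(range(len(candidates)), key=score)

# Three sampled RTL snippets; the majority variant wins the re-ranking.
cands = [
    "assign y = a & b ;".split(),
    "assign y = a & b ;".split(),
    "assign y = a | b ;".split(),
]
print(" ".join(cands[rerank(cands)]))  # → assign y = a & b ;
```

Because candidates are still sampled (not greedy), the pool retains diversity; the re-ranking only decides which candidate to emit, matching the abstract's goal of promoting correctness while maintaining variety.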