AutoVeriFix+: High-Correctness RTL Generation via Trace-Aware Causal Fix and Semantic Redundancy Pruning

📅 2026-03-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a limitation of large language models (LLMs) in hardware design: because Verilog training data is scarce, they often produce code that is syntactically correct but functionally wrong. The authors propose AutoVeriFix+, a three-stage framework. First, an LLM generates a high-level behavioral reference model in Python. Second, another LLM generates candidate Verilog RTL implementations and iteratively fixes their syntactic errors. Third, a concolic testing engine uses cycle-accurate simulation traces and internal register snapshots to give the LLM the causal context needed to locate and repair state-transition errors, and a coverage report flags functionally redundant branches for semantic pruning. On the VerilogEval-machine benchmark, the approach reaches a pass@10 of 90.2%, and it removes an average of 25% redundant logic across benchmarks, improving functional correctness while reducing area.
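
The trace-aware part of the third stage can be illustrated with a toy sketch. It is not the authors' implementation: the counter design, the `Snapshot` record, and the `first_divergence` helper are hypothetical names chosen for illustration, and a plain Python function stands in for a simulated RTL candidate. The idea shown is only that comparing per-cycle register snapshots against a Python reference model yields the first diverging cycle plus the state that led to it, which is the kind of causal context the summary describes.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical per-cycle record: inputs applied and registers after the clock edge."""
    cycle: int
    inputs: dict
    regs: dict


def reference_counter(stimulus, width=2):
    """Python reference model: wrapping counter with synchronous reset and enable."""
    count, trace = 0, []
    for cycle, inp in enumerate(stimulus):
        if inp["rst"]:
            count = 0
        elif inp["en"]:
            count = (count + 1) % (1 << width)
        trace.append(Snapshot(cycle, inp, {"count": count}))
    return trace


def buggy_counter(stimulus, width=2):
    """Stand-in for a simulated RTL candidate: it saturates instead of wrapping."""
    count, trace = 0, []
    max_value = (1 << width) - 1
    for cycle, inp in enumerate(stimulus):
        if inp["rst"]:
            count = 0
        elif inp["en"]:
            count = min(count + 1, max_value)
        trace.append(Snapshot(cycle, inp, {"count": count}))
    return trace


def first_divergence(ref_trace, dut_trace):
    """Return the first cycle where registers differ, with the preceding state."""
    for ref, dut in zip(ref_trace, dut_trace):
        if ref.regs != dut.regs:
            prev = dut_trace[ref.cycle - 1].regs if ref.cycle > 0 else None
            return {
                "cycle": ref.cycle,
                "inputs": ref.inputs,
                "prev_regs": prev,
                "expected": ref.regs,
                "observed": dut.regs,
            }
    return None


# One reset cycle followed by five enabled cycles.
stimulus = [{"rst": 1, "en": 0}] + [{"rst": 0, "en": 1}] * 5
report = first_divergence(reference_counter(stimulus), buggy_counter(stimulus))
print(report)
```

A report of this shape (cycle, inputs, previous registers, expected and observed values) could be placed in a repair prompt so the model sees which transition failed, not just that a test failed.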

📝 Abstract
Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as Python and C++. However, their application to hardware description languages, such as Verilog, is challenging due to the scarcity of high-quality training data. Current approaches to Verilog code generation using LLMs often focus on syntactic correctness, resulting in code with functional errors. To address these challenges, we propose AutoVeriFix+, a novel three-stage framework that integrates high-level semantic reasoning with state-space exploration to enhance functional correctness and design efficiency. In the first stage, an LLM is employed to generate high-level Python reference models that define the intended circuit behavior. In the second stage, another LLM generates initial Verilog RTL candidates and iteratively fixes syntactic errors. In the third stage, we introduce a concolic testing engine to exercise deep sequential logic and identify corner-case vulnerabilities. With cycle-accurate execution traces and internal register snapshots, AutoVeriFix+ provides the LLM with the causal context necessary to resolve complex state-transition errors. Furthermore, it generates a coverage report to identify functionally redundant branches, enabling the LLM to perform semantic pruning for area optimization. Experimental results demonstrate that AutoVeriFix+ achieves over 80% functional correctness on rigorous benchmarks, reaching a pass@10 score of 90.2% on the VerilogEval-machine dataset. In addition, it eliminates an average of 25% redundant logic across benchmarks through trace-aware optimization.
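
The coverage-driven pruning step can be sketched in the same spirit. This is a minimal illustration under stated assumptions, not the paper's engine: the branch labels, the `next_count` function, and `coverage_report` are hypothetical, and exhaustive enumeration of a tiny input space is swapped in for concolic testing. It shows how a branch-coverage report can expose a branch that is never taken and is therefore a candidate for removal.

```python
from itertools import product

# Hypothetical labels for the branches of the next-state logic below.
BRANCHES = ("reset", "load", "inc", "inc_when_load", "hold")


def next_count(count, rst, load, en, data, hits):
    """Toy next-state function for a 2-bit loadable counter, instrumented per branch."""
    if rst:
        hits.add("reset")
        return 0
    elif load:
        hits.add("load")
        return data
    elif en and not load:
        hits.add("inc")
        return (count + 1) % 4
    elif en and load:
        # Shadowed by the "load" branch above, so this branch is never taken.
        hits.add("inc_when_load")
        return (data + 1) % 4
    else:
        hits.add("hold")
        return count


def coverage_report():
    """Drive every state/input combination and record which branches were hit."""
    hits = set()
    for count, rst, load, en, data in product(range(4), (0, 1), (0, 1), (0, 1), range(4)):
        next_count(count, rst, load, en, data, hits)
    return {branch: branch in hits for branch in BRANCHES}


report = coverage_report()
unhit = [branch for branch, hit in report.items() if not hit]
print(unhit)
```

In this toy the input space is enumerated exhaustively, so an unhit branch is provably dead. With a test engine that explores only part of the state space, an unhit branch is a hint rather than a proof, and a pruned design would need to be re-checked against the reference model.
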
Problem

Research questions and friction points this paper is trying to address.

Verilog code generation
functional correctness
large language models
hardware description languages
semantic errors
Innovation

Methods, ideas, or system contributions that make the work stand out.

trace-aware causal fix
semantic redundancy pruning
concolic testing
RTL generation
LLM for hardware