🤖 AI Summary
This work addresses two problems: writing SystemVerilog Assertions (SVAs) by hand is time-consuming and error-prone, and SVA code generated directly by general-purpose large language models (LLMs) is often syntactically or semantically incorrect. To overcome these limitations, the paper proposes the SVA Generator framework, which constructs high-fidelity training data through abstract syntax tree (AST) constraint injection, formal semantic equivalence verification, and deduplication mechanisms. An automated supervision pipeline further ensures structural consistency and semantic correctness of the generated assertions. Evaluated on benchmark tiers of increasing AST depth (D2–D4), the approach achieves an average improvement of 22.7 percentage points in semantic equivalence rate over the best-performing general LLM baseline, while maintaining comparable syntax pass rates.
📝 Abstract
Functional verification remains a dominant cost in modern IC development, and SystemVerilog Assertions (SVAs) are critical for simulation-based monitoring and formal property checking. However, writing SVAs by hand is time-consuming and error-prone. Directly prompting general-purpose large language models (LLMs) is also unreliable: the generated properties are often syntactically invalid or semantically incorrect, and the problem is exacerbated by the scarcity of high-quality domain training data. We present SVA Generator, a data-centric framework that translates natural-language SVA Descriptions (SVADs) into executable SVAs. It uses AST-grounded constraint injection and an automated supervision pipeline that enforces structural consistency and reduces hallucinations via de-duplication and constraint checks. To enable rigorous evaluation, we introduce a benchmark suite stratified by AST depth. We use formal property equivalence checking to quantify semantic correctness separately from syntax validity: a generated property counts as equivalent when it and the reference property imply each other under the same clocking and environment assumptions. Across all difficulty tiers, SVA Generator achieves a Syntax Pass Rate (SPR) comparable to strong general LLM baselines, while delivering substantially higher Semantic Equivalence Rate (SER) on deeper tiers: +24.5 pp on D2, +26.0 pp on D3, and +17.5 pp on D4 relative to the best-performing general LLM, corresponding to a +22.7 pp SER improvement on average over D2–D4. These results highlight that high-fidelity data construction and depth-stratified benchmarking are key to reliable, semantics-preserving SVA generation.
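The mutual-implication check described in the abstract can be illustrated with a toy sketch. It is not the paper's tooling, which relies on formal property checking. Here properties are modeled as hypothetical Python predicates over short traces of `(req, ack)` pairs, equivalence is tested by exhaustive enumeration up to an assumed bound of four cycles, and an obligation still pending at the end of a trace is treated as satisfied (an assumed weak semantics).

```python
from itertools import product

# Toy model: a trace is a tuple of (req, ack) samples, one per clock cycle.
# All names below are hypothetical and chosen only for illustration.

def holds_ref(trace):
    # Reference property, roughly `req |-> ##1 ack`:
    # whenever req is high at cycle t, ack must be high at cycle t + 1.
    return all(trace[t + 1][1] for t in range(len(trace) - 1) if trace[t][0])

def holds_gen(trace):
    # A differently phrased candidate with the same meaning:
    # there is no cycle where req is high and the next ack is low.
    return not any(trace[t][0] and not trace[t + 1][1]
                   for t in range(len(trace) - 1))

def holds_wrong(trace):
    # A plausible but incorrect candidate, roughly `req |-> ack`:
    # ack is required in the same cycle as req.
    return all(ack for req, ack in trace if req)

def implies(p, q, depth):
    # p implies q if every trace of the given length that satisfies p
    # also satisfies q.
    samples = list(product([False, True], repeat=2))
    return all(q(trace) for trace in product(samples, repeat=depth) if p(trace))

def equivalent(p, q, depth=4):
    # Mutual implication over all traces of the given length.
    return implies(p, q, depth) and implies(q, p, depth)

print(equivalent(holds_ref, holds_gen))
print(equivalent(holds_ref, holds_wrong))
```

Under this sketch, a candidate counts toward SER only if `equivalent` returns true, while a candidate such as `holds_wrong` could still count toward SPR because it is well-formed. Bounded enumeration can miss differences that appear only on longer traces, which is one reason a formal checker is the appropriate tool in practice.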