🤖 AI Summary
In hardware verification, manually authoring SystemVerilog assertions (SVAs) is labor-intensive and costly, while closed-source LLMs (e.g., GPT-4) suffer from low assertion accuracy and pose licensing and data privacy risks. To address this, we propose VERT, the first high-quality, open-source synthetic SVA dataset tailored for hardware verification. VERT is systematically constructed by applying variable-aware augmentation to open-source HDL code to generate semantically valid assertion pairs. We perform supervised fine-tuning on DeepSeek-Coder 6.7B and Llama-3.1 8B, enabling fully local deployment that preserves data confidentiality. Experiments across four major SoC platforms, including OpenTitan, demonstrate that our fine-tuned models achieve up to 96.88% higher assertion accuracy than their base models and, for the first time, surpass GPT-4o by 24.14%. VERT is publicly released, establishing a new paradigm for low-cost, trustworthy, and automated SVA generation.
📝 Abstract
Hardware verification is crucial in modern SoC design, consuming around 70% of development time. SystemVerilog assertions (SVAs) ensure correct functional behavior. However, existing industrial practices rely on manual assertion generation, which becomes increasingly untenable as hardware systems grow more complex. Recent research shows that Large Language Models (LLMs) can automate this process. However, proprietary state-of-the-art models like GPT-4o often generate inaccurate assertions and require expensive licenses, while smaller open-source LLMs need fine-tuning to manage the complexities of HDL code. To address these issues, we introduce **VERT**, an open-source dataset designed to enhance SystemVerilog assertion generation using LLMs. VERT enables researchers in academia and industry to fine-tune open-source models that outperform larger proprietary ones in both accuracy and efficiency, while ensuring data privacy through local fine-tuning and eliminating costly licenses. The dataset is curated by systematically augmenting variables from open-source HDL repositories to generate synthetic code snippets paired with corresponding assertions. Experimental results demonstrate that fine-tuned models such as DeepSeek-Coder 6.7B and Llama-3.1 8B outperform GPT-4o, achieving up to 96.88% improvement over their base models and 24.14% over GPT-4o on platforms including OpenTitan, CVA6, OpenPiton, and Pulpissimo. VERT is available at https://github.com/AnandMenon12/VERT.
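The abstract describes curating the dataset by augmenting variables in open-source HDL code so that each synthetic snippet stays paired with a semantically valid assertion. The paper's actual pipeline is not reproduced here; the sketch below is a minimal, hypothetical illustration of the idea, renaming signals consistently across an RTL fragment and its SVA (all signal names and the `augment_pair` helper are invented for illustration):

```python
import re

def augment_pair(hdl_snippet: str, assertion: str, rename_map: dict) -> tuple:
    """Rename signal identifiers consistently in an HDL snippet and its
    paired SVA, so the augmented pair remains semantically valid."""
    def rename(text: str) -> str:
        for old, new in rename_map.items():
            # Whole-word replacement so e.g. 'req' does not clobber 'req_valid'.
            text = re.sub(rf"\b{re.escape(old)}\b", new, text)
        return text
    return rename(hdl_snippet), rename(assertion)

# Hypothetical handshake fragment and its assertion.
hdl = "always @(posedge clk) if (req) grant <= 1'b1;"
sva = "assert property (@(posedge clk) req |=> grant);"

# One augmentation: rename 'req' -> 'valid_in' and 'grant' -> 'ack_out'.
new_hdl, new_sva = augment_pair(hdl, sva, {"req": "valid_in", "grant": "ack_out"})
print(new_hdl)
print(new_sva)
```

Because the same renaming is applied to both sides, every augmented snippet-assertion pair expresses the same temporal property over fresh identifiers, which is what lets a small seed of real HDL code expand into a large fine-tuning dataset.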