Are LLMs Ready for Practical Adoption for Assertion Generation?

📅 2025-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
The practical utility of large language models (LLMs) for hardware assertion generation has not been systematically assessed, and commercial LLMs produce SystemVerilog assertions riddled with syntactic and semantic errors. Method: We introduce AssertionBench, the first dedicated benchmark for assertion generation, and propose AssertionLLM, the first LLM specifically fine-tuned for this task. Built on the Transformer architecture, AssertionLLM employs supervised fine-tuning that deeply integrates hardware description language (HDL) semantics and assertion modeling patterns. Contribution/Results: Experiments demonstrate that AssertionLLM significantly improves both the syntactic correctness and semantic accuracy of generated assertions, cutting the assertion error rate by more than 50% relative to leading commercial LLMs. This work fills a critical gap in LLM customization for hardware verification and establishes a new paradigm for trustworthy AI-assisted hardware design.

📝 Abstract
Assertions have been the de facto collateral for simulation-based and formal verification of hardware designs for over a decade. The quality of hardware verification, i.e., detection and diagnosis of corner-case design bugs, is critically dependent on the quality of the assertions. With the onset of generative AI such as Transformers and Large Language Models (LLMs), there has been a renewed interest in developing novel, effective, and scalable techniques for generating functional and security assertions from design source code. While there have been recent works that use commercial off-the-shelf (COTS) LLMs for assertion generation, there is no comprehensive study quantifying the effectiveness of LLMs in generating syntactically and semantically correct assertions. In this paper, we first discuss AssertionBench from our prior work, a comprehensive set of designs and assertions to quantify the goodness of a broad spectrum of COTS LLMs for the task of assertion generation from hardware design source code. Our key insight was that COTS LLMs are not yet ready for prime-time adoption for assertion generation, as they generate a considerable fraction of syntactically and semantically incorrect assertions. Motivated by this insight, we propose AssertionLLM, a first-of-its-kind LLM model specifically fine-tuned for assertion generation. Our initial experimental results show that AssertionLLM considerably improves the semantic and syntactic correctness of the generated assertions over COTS LLMs.
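For readers unfamiliar with the artifacts being generated, a minimal illustrative SystemVerilog Assertion (SVA) is sketched below. The design, signal names (`clk`, `rst_n`, `req`, `ack`), and timing bound are assumptions for illustration and are not taken from the paper's benchmark designs:

```systemverilog
// Illustrative sketch only: a concurrent assertion for a hypothetical
// request/acknowledge handshake. All names and bounds are assumed.
module handshake_checker(input logic clk, rst_n, req, ack);
  // Every request must be followed by an acknowledge within 1 to 3 cycles.
  ack_within_3: assert property (
    @(posedge clk) disable iff (!rst_n)
    req |-> ##[1:3] ack
  );
endmodule
```

Correctness here has two layers, matching the paper's evaluation axes: the assertion must parse under SVA syntax (e.g., using the implication operator `|->` inside a clocked property, not a bare Boolean `->`), and it must capture the intended design behavior (the right signals, clocking, and timing window).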
Problem

Research questions and friction points this paper is trying to address.

Assessing LLMs' readiness for practical assertion generation in hardware verification.
Evaluating LLMs' effectiveness in generating syntactically and semantically correct assertions.
Developing AssertionLLM, a fine-tuned model for improved assertion generation accuracy.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned LLM for assertion generation
Improved syntactic and semantic correctness
Comprehensive benchmark for LLM evaluation