SecureRAG-RTL: A Retrieval-Augmented, Multi-Agent, Zero-Shot LLM-Driven Framework for Hardware Vulnerability Detection

📅 2026-03-05
🤖 AI Summary
This work addresses the limited effectiveness of large language models (LLMs) in hardware security verification: publicly available hardware description language (HDL) data is scarce, which hampers their ability to detect vulnerabilities in HDL designs. To overcome this, the authors propose SecureRAG-RTL, a framework that applies retrieval-augmented generation (RAG) to hardware security verification. By combining multi-agent, zero-shot reasoning with domain-specific knowledge retrieval, SecureRAG-RTL compensates for LLMs' gaps in hardware security expertise. The authors also curate and annotate a benchmark of 14 HDL designs containing real-world security vulnerabilities, which they plan to release publicly. Experiments show that the approach raises vulnerability detection accuracy by about 30% on average across LLMs of different architectures and sizes, relative to prompt-only baselines.

📝 Abstract
Large language models (LLMs) have shown remarkable capabilities in natural language processing tasks, yet their application in hardware security verification remains limited due to the scarcity of publicly available hardware description language (HDL) datasets. This knowledge gap constrains LLM performance in detecting vulnerabilities within HDL designs. To address this challenge, we propose SecureRAG-RTL, a novel Retrieval-Augmented Generation (RAG)-based approach that significantly enhances LLM-based security verification of hardware designs. Our approach integrates domain-specific retrieval with generative reasoning, enabling models to overcome inherent limitations in hardware security expertise. We establish baseline vulnerability detection rates using prompt-only methods and then demonstrate that SecureRAG-RTL achieves substantial improvements across diverse LLM architectures, regardless of size. On average, our method increases detection accuracy by about 30%, highlighting its effectiveness in bridging domain knowledge gaps. For evaluation, we curated and annotated a benchmark dataset of 14 HDL designs containing real-world security vulnerabilities, which we will release publicly to support future research. These findings underscore the potential of RAG-driven augmentation to enable scalable, efficient, and accurate hardware security verification workflows.
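
The abstract describes pairing domain-specific retrieval with generative reasoning but gives no implementation details. The following is a minimal sketch of the general retrieval-augmented prompting idea, not the authors' pipeline: the knowledge-base entries, the bag-of-words cosine ranking, the prompt wording, and all function names (`retrieve`, `build_prompt`) are hypothetical and chosen only for illustration. No LLM is called; the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: hand-written weakness notes, not taken from the paper.
KNOWLEDGE_BASE = [
    "Debug interface left enabled in production allows unauthorized register access.",
    "Finite state machine with missing default case can enter an undefined state.",
    "Lock bit register is writable after being set, so protection can be cleared.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Return the k notes most similar to the query (assumed ranking scheme)."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(hdl_snippet, docs, k=1):
    """Prepend the retrieved notes to the HDL under review in a zero-shot prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(hdl_snippet, docs, k))
    return (
        "You are a hardware security reviewer.\n"
        f"Relevant weakness notes:\n{context}\n\n"
        f"HDL under review:\n{hdl_snippet}\n\n"
        "List any security vulnerabilities you find."
    )

# Illustrative Verilog fragment (hypothetical, not from the benchmark).
SNIPPET = """
// lock bit should be write-once, but this register stays writable
always @(posedge clk)
  if (write_en) lock_bit <= wdata[0];
"""

if __name__ == "__main__":
    print(build_prompt(SNIPPET, KNOWLEDGE_BASE))
```

A prompt-only baseline, in this sketch, would correspond to sending the HDL and the final instruction without the retrieved notes. A real system would likely replace the word-overlap ranking with learned embeddings and a larger curated corpus.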
Problem

Research questions and friction points this paper is trying to address.

hardware security
vulnerability detection
hardware description language
large language models
domain knowledge gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
Hardware Security Verification
Large Language Models
Zero-Shot Learning
HDL Vulnerability Detection