HDLxGraph: Bridging Large Language Models and HDL Repositories via HDL Graph Databases

📅 2025-05-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) show limited comprehension and generation capability on repository-level hardware description language (HDL) projects, which typically comprise thousands to tens of thousands of lines of code. Method: The paper introduces a dual-view HDL graph database that integrates abstract syntax trees (ASTs) and data flow graphs (DFGs), coupled with a task-adaptive Graph Retrieval-Augmented Generation (Graph RAG) framework. It proposes the first multi-granularity, structured-semantic joint retrieval mechanism tailored to HDL. Contribution/Results: The authors construct HDLSearch, the first real-world, repository-level HDL search benchmark. Experimental results show that the approach improves search accuracy, debugging efficiency, and code completion quality by 12.04%, 12.22%, and 5.04%, respectively, over conventional semantic RAG baselines. All artifacts, including source code, graph database construction tools, and the HDLSearch benchmark, are publicly released.
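To make the dual-view idea concrete, the following is a minimal sketch of a graph store that keeps an AST-style containment view and a DFG-style signal-dependency view over the same node ids. The class name `DualViewGraph`, its methods, and the hand-built full-adder example are all hypothetical illustrations; they are not taken from the HDLxGraph code base, which may use a very different schema and a real graph database.

```python
from collections import defaultdict


class DualViewGraph:
    """Toy store with two edge sets over shared node ids (hypothetical schema)."""

    def __init__(self):
        self.nodes = {}                     # node id -> attributes
        self.ast_edges = defaultdict(set)   # parent -> children (code view)
        self.dfg_edges = defaultdict(set)   # driver -> driven signals (hardware view)

    def add_node(self, node_id, kind):
        self.nodes[node_id] = {"kind": kind}

    def add_ast_edge(self, parent, child):
        self.ast_edges[parent].add(child)

    def add_dfg_edge(self, src, dst):
        self.dfg_edges[src].add(dst)

    def fan_in(self, signal):
        """Return every signal that transitively drives `signal` in the DFG view."""
        reverse = defaultdict(set)
        for src, dsts in self.dfg_edges.items():
            for dst in dsts:
                reverse[dst].add(src)
        seen, stack = set(), [signal]
        while stack:
            for src in reverse[stack.pop()]:
                if src not in seen:
                    seen.add(src)
                    stack.append(src)
        return seen


# Hand-built graph (no parser involved) for an assumed full adder:
#   assign t    = a ^ b;
#   assign sum  = t ^ cin;
#   assign cout = (a & b) | (t & cin);
g = DualViewGraph()
g.add_node("adder", "module")
for sig in ("a", "b", "cin", "t", "sum", "cout"):
    g.add_node(sig, "signal")
    g.add_ast_edge("adder", sig)            # code view: the module contains the signal
for src, dst in [("a", "t"), ("b", "t"), ("t", "sum"), ("cin", "sum"),
                 ("a", "cout"), ("b", "cout"), ("t", "cout"), ("cin", "cout")]:
    g.add_dfg_edge(src, dst)                # hardware view: src drives dst

print(sorted(g.fan_in("sum")))              # ['a', 'b', 'cin', 't']
```

The containment edges answer code-oriented questions (which declarations belong to which module), while the dependency edges answer hardware-oriented ones (which signals can influence a faulty output), which is the kind of query a debugging task would likely need.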

📝 Abstract

Large Language Models (LLMs) have demonstrated their potential in hardware design tasks, such as Hardware Description Language (HDL) generation and debugging. Yet their performance remains limited on real-world, repository-level HDL projects with thousands or even tens of thousands of lines of code. To this end, we propose HDLxGraph, a novel framework that integrates Graph Retrieval Augmented Generation (Graph RAG) with LLMs, introducing HDL-specific graph representations that incorporate Abstract Syntax Trees (ASTs) and Data Flow Graphs (DFGs) to capture both the code graph view and the hardware graph view. HDLxGraph uses a dual-retrieval mechanism that not only mitigates the limited recall inherent in similarity-based semantic retrieval by incorporating structural information, but also extends to various real-world tasks through task-specific retrieval fine-tuning. Additionally, to address the lack of comprehensive HDL search benchmarks, we introduce HDLSearch, a multi-granularity evaluation dataset derived from real-world repository-level projects. Experimental results demonstrate that HDLxGraph improves average search accuracy, debugging efficiency, and completion quality by 12.04%, 12.22%, and 5.04%, respectively, compared to similarity-based RAG. The code of HDLxGraph and the collected HDLSearch benchmark are available at https://github.com/Nick-Zheng-Q/HDLxGraph.
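The recall argument in the abstract can be illustrated with a small sketch: a similarity step picks seed chunks, then a structural step follows graph edges to pull in related chunks that share no vocabulary with the query. Everything here is an assumption made for illustration: the function `dual_retrieve`, the bag-of-words cosine score (a stand-in for a learned embedding model), and the three example chunks. It is not the retrieval algorithm from the paper.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words vector; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def dual_retrieve(query, chunks, edges, top_k=1, hops=1):
    """Similarity step selects seeds; structural step expands along graph edges."""
    q = bow(query)
    ranked = sorted(chunks, key=lambda name: cosine(q, bow(chunks[name])),
                    reverse=True)
    result = ranked[:top_k]
    frontier = set(result)
    for _ in range(hops):
        # Follow outgoing edges (e.g. instantiation or data-flow links).
        frontier = {n for s in frontier for n in edges.get(s, ())} - set(result)
        result.extend(sorted(frontier))
    return result


# Hypothetical repository: descriptions of three modules and one structural link set.
chunks = {
    "uart_tx": "uart transmitter shifts out serial data bits with start and stop bit",
    "baud_gen": "clock divider that produces a tick enable",
    "fifo": "synchronous fifo buffer with full and empty flags",
}
edges = {"uart_tx": ["baud_gen", "fifo"]}   # uart_tx depends on both modules

print(dual_retrieve("uart serial transmitter", chunks, edges))
# ['uart_tx', 'baud_gen', 'fifo']
```

With `hops=0` the call returns only `uart_tx`, because `baud_gen` and `fifo` score zero similarity against the query. The structural expansion is what brings them in, which is the kind of recall gap the abstract attributes to similarity-only retrieval.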

Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM performance in large HDL projects

Integrating Graph RAG with ASTs and DFGs

Addressing the lack of HDL search benchmarks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates Graph RAG with LLMs for HDL

Uses ASTs and DFGs for code representation

Introduces HDLSearch benchmark for evaluation
