Automatically Improving LLM-based Verilog Generation using EDA Tool Feedback

📅 2024-11-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional Verilog hardware design relies heavily on manual debugging, which is time-consuming and error-prone for complex designs; existing LLM-based HDL generation studies predominantly focus on single-shot synthesis without incorporating EDA tool feedback for iterative correction. Method: This paper presents AutoChip, an open-source, customizable framework that integrates compilation and simulation feedback into LLM-driven Verilog generation. It combines conversational large language models (e.g., GPT-4o) with the output of Verilog compilers and simulators to iteratively detect and repair errors in generated code. Contribution/Results: On the VerilogEval benchmark, EDA tool feedback proved consistently more effective than zero-shot prompting only with GPT-4o, the most computationally complex model evaluated; in the best case, successful designs increased by 5.8% at a 34.2% lower cost than the best zero-shot results. Mixing smaller models in early feedback iterations with GPT-4o at the end matched the success of GPT-4o with feedback at 41.9% lower cost, corresponding to an overall 89.6% cost reduction versus zero-shot.

📝 Abstract
Traditionally, digital hardware designs are written in the Verilog hardware description language (HDL) and debugged manually by engineers. This can be time-consuming and error-prone for complex designs. Large Language Models (LLMs) are emerging as a potential tool to help generate fully functioning HDL code, but most works have focused on generation in the single-shot capacity: i.e., run and evaluate, a process that does not leverage debugging and, as such, does not adequately reflect a realistic development process. In this work, we evaluate the ability of LLMs to leverage feedback from electronic design automation (EDA) tools to fix mistakes in their own generated Verilog. To accomplish this, we present an open-source, highly customizable framework, AutoChip, which combines conversational LLMs with the output from Verilog compilers and simulations to iteratively generate and repair Verilog. To determine the success of these LLMs we leverage the VerilogEval benchmark set. We evaluate four state-of-the-art conversational LLMs, focusing on readily accessible commercial models. EDA tool feedback proved to be consistently more effective than zero-shot prompting only with GPT-4o, the most computationally complex model we evaluated. In the best case, we observed a 5.8% increase in the number of successful designs with a 34.2% decrease in cost over the best zero-shot results. Mixing smaller models with this larger model at the end of the feedback iterations resulted in equally as much success as with GPT-4o using feedback, but incurred 41.9% lower cost (corresponding to an overall decrease in cost over zero-shot by 89.6%).
Problem

Research questions and friction points this paper is trying to address.

Automating Verilog code generation using LLMs and EDA feedback.
Reducing manual debugging in digital hardware design processes.
Improving cost-efficiency and success rates in Verilog generation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

AutoChip framework integrates LLMs with EDA tools.
Iterative Verilog generation and repair using EDA feedback.
Cost-effective model mixing reduces computational expenses.
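The iterative generate-and-repair loop described above can be sketched as follows. This is a minimal illustration of the general technique, not AutoChip's actual API; all function names here are hypothetical, and in practice `compile_and_test` would invoke an EDA tool such as a Verilog compiler plus a simulation testbench.

```python
# Minimal sketch of an EDA-feedback repair loop (illustrative, not AutoChip's API).

def repair_loop(generate, compile_and_test, max_iters=5):
    """Iteratively regenerate Verilog, feeding EDA tool output back to the LLM.

    generate(feedback) -> str: LLM call; feedback is None on the first turn.
    compile_and_test(code) -> (bool, str): e.g. compile and simulate the code,
        returning (passed, tool error/log text to feed back to the model).
    """
    feedback = None
    for attempt in range(max_iters):
        code = generate(feedback)                 # initial or repaired attempt
        passed, feedback = compile_and_test(code)  # EDA tool provides feedback
        if passed:
            return code, attempt
    return None, max_iters


# Toy stand-ins for demonstration: the "model" fixes its output once it
# sees an error message, mimicking a single successful repair iteration.
def toy_generate(feedback):
    return "fixed_module" if feedback else "buggy_module"

def toy_check(code):
    return (code == "fixed_module", "syntax error near ';'")

code, iters = repair_loop(toy_generate, toy_check)
```

The model-mixing strategy from the paper would slot into `generate`: route early iterations to a cheaper model and only the final iterations to a larger model such as GPT-4o, trading a small amount of per-call quality for a large reduction in total cost.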