RocketPPA: Ultra-Fast LLM-Based PPA Estimator at Code-Level Abstraction

📅 2025-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses hardware design automation by proposing, for the first time, an end-to-end large language model (LLM) method that directly predicts PPA (power, performance, area) metrics from synthesizable Verilog code, bridging the gap between high-level code generation and physical implementation assessment. To this end, the authors construct a curated dataset of 21,000 Verilog–PPA pairs, cleaned with a chain-of-thought–driven process. They further introduce a hybrid LoRA+MoE architecture that jointly optimizes regression modeling and multi-granularity error calibration. Fine-tuned from CodeLlama, the model improves prediction accuracy by +5.9%/+7.2% (power), +5.1%/+3.9% (delay), and +4.0%/+7.9% (area) under 20%/10% error thresholds, respectively; the MoE component contributes an additional 3–4% gain across all metrics.

📝 Abstract

Large language models have recently transformed hardware design, yet bridging the gap between code synthesis and PPA (power, performance, and area) estimation remains a challenge. In this work, we introduce a novel framework built on a dataset of 21k thoroughly cleaned, synthesizable Verilog modules, each annotated with detailed power, delay, and area metrics. We use chain-of-thought techniques to automatically debug and curate this dataset, ensuring high fidelity in downstream applications. We then fine-tune CodeLlama using LoRA-based parameter-efficient methods, framing the task as a regression problem that predicts PPA metrics from Verilog code. Furthermore, we augment our approach with a mixture-of-experts architecture, integrating both LoRA and an additional MLP expert layer, to further refine predictions. Experimental results demonstrate significant improvements: power estimation accuracy improves by 5.9% at a 20% error threshold and by 7.2% at a 10% threshold; delay estimation improves by 5.1% and 3.9%; and area estimation improves by 4% and 7.9% at the 20% and 10% thresholds, respectively. Notably, the mixture-of-experts module contributes an additional 3–4% improvement across these tasks. Our results establish a new benchmark for PPA-aware Verilog generation, highlighting the effectiveness of our integrated dataset and modeling strategies for next-generation EDA workflows.
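
The abstract reports accuracy "at a 20% error threshold" and "at a 10% threshold" without defining the metric here. One plausible reading, assumed for this sketch, is the share of test modules whose predicted value falls within the given relative error of the reference value. The function name `accuracy_within` and the sample numbers are hypothetical and are not taken from the paper.

```python
def accuracy_within(preds, targets, threshold):
    """Fraction of predictions whose relative error is <= threshold.

    Assumed metric: |pred - target| / |target| <= threshold counts as a hit.
    """
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not targets:
        raise ValueError("need at least one sample")
    hits = 0
    for pred, target in zip(preds, targets):
        if target == 0:
            raise ValueError("relative error is undefined for a zero target")
        if abs(pred - target) / abs(target) <= threshold:
            hits += 1
    return hits / len(targets)


# Made-up reference and predicted values (for example, area in arbitrary units).
targets = [10.0, 20.0, 40.0, 80.0]
preds = [10.5, 23.0, 47.0, 120.0]

acc_20 = accuracy_within(preds, targets, 0.20)  # relative errors: 0.05, 0.15, 0.175, 0.5
acc_10 = accuracy_within(preds, targets, 0.10)
print(acc_20, acc_10)
```

Under this reading, tightening the threshold from 20% to 10% can only keep the score the same or lower it, which is why the two thresholds are reported separately for each of power, delay, and area.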

Problem

Research questions and friction points this paper is trying to address.

Bridging code synthesis and PPA estimation in hardware design

Predicting PPA metrics from Verilog code using LLMs

Improving accuracy of power, delay, and area estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tune CodeLlama with LoRA for PPA regression

Use mixture-of-experts with LoRA and MLP

Leverage cleaned Verilog dataset with CoT techniques
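
The listing does not spell out how the LoRA and MLP experts are combined, so the following is only a toy sketch of one common mixture-of-experts pattern: a softmax gate that weights the scalar outputs of two regression experts. Every name, shape, and weight here (`lora_head`, `mlp_head`, `moe_predict`, the 2-dimensional feature vector) is an assumption for illustration and is not the authors' architecture.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lora_head(x, w, lora_a, lora_b, scale):
    """Scalar linear head with a low-rank update: (w + scale * b @ A) . x"""
    low_rank = sum(b_k * dot(a_k, x) for a_k, b_k in zip(lora_a, lora_b))
    return dot(w, x) + scale * low_rank


def mlp_head(x, w1, b1, w2, b2):
    """One-hidden-layer MLP with ReLU, producing a scalar."""
    hidden = [max(0.0, dot(row, x) + bias) for row, bias in zip(w1, b1)]
    return dot(w2, hidden) + b2


def moe_predict(x, gate_weights, experts):
    """Softmax-gated weighted sum of expert outputs (hypothetical gating)."""
    gates = softmax([dot(g, x) for g in gate_weights])
    return sum(g * expert(x) for g, expert in zip(gates, experts))


# Hypothetical 2-dimensional feature vector standing in for an LLM embedding.
x = [1.0, 2.0]
experts = [
    lambda v: lora_head(v, w=[0.5, 0.25], lora_a=[[1.0, 0.0]], lora_b=[2.0], scale=0.5),
    lambda v: mlp_head(v, w1=[[1.0, -1.0], [0.5, 0.5]], b1=[0.0, 0.0], w2=[1.0, 2.0], b2=1.0),
]
gate_weights = [[0.0, 0.0], [0.0, 0.0]]  # all-zero gate gives equal weight to both experts
print(moe_predict(x, gate_weights, experts))
```

In a real fine-tuning setup the feature vector would come from the language model's hidden states and all weights would be learned; the sketch only shows how a gate can blend a LoRA-adapted head with an MLP expert for a regression target.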
