LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow

📅 2025-08-25
🤖 AI Summary
Existing methods struggle to model architecture-, application-, and input-dependent control flow in a unified way, yielding low accuracy and slow speed in dataflow accelerator performance prediction and hindering efficient design-space exploration. This paper proposes the first general-purpose performance prediction framework for dataflow accelerators. It models performance values as categorical token sequences, integrating large language models for program semantic encoding and program-structure awareness; introduces a reinforcement learning-driven dynamic calibration mechanism that adapts to input-dependent control flow; and designs a multi-level dataflow pattern generation scheme with progressive data augmentation to strengthen cross-architecture generalization. Experiments show that the framework achieves range-agnostic prediction and confidence estimation on unseen applications, reducing loop-cycle prediction error to 11.2%, a 9.7% improvement over static baselines.

📝 Abstract
Accurate and fast performance prediction for dataflow-based accelerators is vital for efficient hardware design and design space exploration, yet existing methods struggle to generalize across architectures, applications, and input-dependent control flows. We present LLMulator, a progressive numeric modeling framework leveraging the program semantic knowledge of pre-trained large language models (LLMs) for robust, hardware- and application-aware prediction. Our numeric model treats performance values as categorical token sequences, enabling range-agnostic estimates and confidence-aware predictions for unseen applications. To handle input-dependent control flows, we introduce a reinforcement learning-based dynamic calibration method, reducing cycle prediction error by 9.7% over static models and converging to 11.2% error after a few iterations. For cross-hardware generalization, we develop a progressive data augmentation strategy that generates diverse datasets covering multi-level dataflow structures, memory parameters, and loop mapping primitives, significantly boosting prediction accuracy across architectures and configurations.
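The abstract's "performance values as categorical token sequences" idea can be illustrated with a toy digit-level encoding: instead of regressing a single scalar, the cycle count is emitted one digit token at a time, so the representable range is unbounded and per-token probabilities can serve as a confidence signal. The encoding below is a hypothetical sketch for intuition, not the paper's actual tokenization.

```python
# Toy sketch of range-agnostic numeric tokenization
# (hypothetical encoding; LLMulator's real scheme may differ).

def encode_cycles(cycles: int) -> list[str]:
    """Encode a non-negative cycle count as digit tokens plus an end marker."""
    return [f"<{d}>" for d in str(cycles)] + ["<end>"]

def decode_cycles(tokens: list[str]) -> int:
    """Invert the encoding, reading digit tokens until the end marker."""
    digits = ""
    for t in tokens:
        if t == "<end>":
            break
        digits += t.strip("<>")
    return int(digits)

tokens = encode_cycles(15342)   # ["<1>", "<5>", "<3>", "<4>", "<2>", "<end>"]
assert decode_cycles(tokens) == 15342
```

Because the model predicts a distribution over digit tokens at each step rather than a bounded scalar, values far outside the training range remain expressible, which is what enables range-agnostic prediction on unseen applications.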
Problem

Research questions and friction points this paper is trying to address.

Accurate performance prediction for dataflow accelerators across architectures
Handling input-dependent control flows in hardware performance modeling
Generalizing performance models across diverse applications and configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging LLMs for hardware-aware performance prediction
Reinforcement learning for input-dependent control flow
Progressive data augmentation for cross-hardware generalization
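The dynamic calibration idea above can be pictured, in outline, as a feedback loop that adjusts a static prediction toward observed cycle measurements. The snippet below is a minimal illustrative stand-in with invented names and a simple scaling update, not the paper's reinforcement-learning algorithm.

```python
# Minimal sketch: feedback-driven calibration of a static cycle estimate.
# The function name, update rule, and learning rate are illustrative assumptions.

def calibrate(static_pred: float, measurements: list[float], lr: float = 0.1) -> float:
    """Iteratively scale a static prediction toward observed cycle counts."""
    scale = 1.0
    for observed in measurements:
        error = observed - static_pred * scale   # signed prediction error
        scale += lr * error / static_pred        # nudge the scale toward the target
    return static_pred * scale

# Example: a static estimate of 1000 cycles drifts toward ~1150-1200 observed.
calibrated = calibrate(1000.0, [1200.0, 1150.0, 1180.0])
```

In the paper's framework, this role is played by a reinforcement-learning policy that updates the model from runtime feedback, which is what lets it track control flow that varies with the input.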
👥 Authors

Kaiyan Chang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China

Wenlong Zhu
SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China

Shengwen Liang
Institute of Computing Technology, Chinese Academy of Sciences

Huawei Li
Institute of Computing Technology, Chinese Academy of Sciences

Ying Wang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China