LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow

📅 2025-08-25
🤖 AI Summary
Existing methods struggle to model architecture-, application-, and input-dependent control flow in a unified way, yielding low accuracy and slow speed in dataflow accelerator performance prediction and hindering efficient design-space exploration. This paper proposes the first general-purpose performance prediction framework for dataflow accelerators. It models performance values as categorical token sequences, integrating large language models for program semantic encoding and program-structure awareness; introduces a reinforcement learning-driven dynamic calibration mechanism that adapts to input-dependent control flow; and designs a multi-level dataflow pattern generation scheme with progressive data augmentation to strengthen cross-architecture generalization. Experiments show that the framework achieves range-agnostic prediction and confidence estimation on unseen applications, reducing loop-cycle prediction error to 11.2%, a 9.7% improvement over static baselines.

📝 Abstract
Accurate and fast performance prediction for dataflow-based accelerators is vital for efficient hardware design and design space exploration, yet existing methods struggle to generalize across architectures, applications, and input-dependent control flows. We present LLMulator, a progressive numeric modeling framework leveraging the program semantic knowledge of pre-trained large language models (LLMs) for robust, hardware- and application-aware prediction. Our numeric model treats performance values as categorical token sequences, enabling range-agnostic estimates and confidence-aware predictions for unseen applications. To handle input-dependent control flows, we introduce a reinforcement learning-based dynamic calibration method, reducing cycle prediction error by 9.7% over static models and converging to 11.2% error after a few iterations. For cross-hardware generalization, we develop a progressive data augmentation strategy that generates diverse datasets covering multi-level dataflow structures, memory parameters, and loop mapping primitives, significantly boosting prediction accuracy across architectures and configurations.
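The abstract's "performance values as categorical token sequences" idea can be illustrated with a toy digit-level encoding: instead of regressing a single scalar, the cycle count is emitted one digit token at a time, so the representable range is unbounded and per-token probabilities can serve as a confidence signal. The encoding below is a hypothetical sketch for intuition, not the paper's actual tokenization.

```python
# Toy sketch of range-agnostic numeric tokenization
# (hypothetical encoding; LLMulator's real scheme may differ).

def encode_cycles(cycles: int) -> list[str]:
    """Encode a non-negative cycle count as digit tokens plus an end marker."""
    return [f"<{d}>" for d in str(cycles)] + ["<end>"]

def decode_cycles(tokens: list[str]) -> int:
    """Invert the encoding, reading digit tokens until the end marker."""
    digits = ""
    for t in tokens:
        if t == "<end>":
            break
        digits += t.strip("<>")
    return int(digits)

tokens = encode_cycles(15342)   # ["<1>", "<5>", "<3>", "<4>", "<2>", "<end>"]
assert decode_cycles(tokens) == 15342
```

Because the model predicts a distribution over digit tokens at each step rather than a bounded scalar, values far outside the training range remain expressible, which is what enables range-agnostic prediction on unseen applications.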
Problem

Research questions and friction points this paper is trying to address.

Accurate performance prediction for dataflow accelerators across architectures
Handling input-dependent control flows in hardware performance modeling
Generalizing performance models across diverse applications and configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging LLMs for hardware-aware performance prediction
Reinforcement learning for input-dependent control flow
Progressive data augmentation for cross-hardware generalization
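The dynamic calibration idea above can be pictured, in outline, as a feedback loop that adjusts a static prediction toward observed cycle measurements. The snippet below is a minimal illustrative stand-in with invented names and a simple scaling update, not the paper's reinforcement-learning algorithm.

```python
# Minimal sketch: feedback-driven calibration of a static cycle estimate.
# The function name, update rule, and learning rate are illustrative assumptions.

def calibrate(static_pred: float, measurements: list[float], lr: float = 0.1) -> float:
    """Iteratively scale a static prediction toward observed cycle counts."""
    scale = 1.0
    for observed in measurements:
        error = observed - static_pred * scale   # signed prediction error
        scale += lr * error / static_pred        # nudge the scale toward the target
    return static_pred * scale

# Example: a static estimate of 1000 cycles drifts toward ~1150-1200 observed.
calibrated = calibrate(1000.0, [1200.0, 1150.0, 1180.0])
```

In the paper's framework, this role is played by a reinforcement-learning policy that updates the model from runtime feedback, which is what lets it track control flow that varies with the input.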
👥 Authors

Kaiyan Chang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China

Wenlong Zhu
SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China

Shengwen Liang
Institute of Computing Technology, Chinese Academy of Sciences

Huawei Li
Institute of Computing Technology, Chinese Academy of Sciences

Ying Wang
SKLP, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China