wa-hls4ml: A Benchmark and Surrogate Models for hls4ml Resource and Latency Estimation

📅 2025-08-25
🤖 AI Summary
To address the bottleneck that high-level synthesis (HLS) poses for iterating on machine learning (ML) hardware designs, this paper introduces wa-hls4ml, a large-scale benchmark for resource and latency prediction tailored to hls4ml, with a dataset of over 680,000 synthesized fully connected and convolutional neural networks targeting Xilinx FPGAs. The authors also propose surrogate models based on graph neural networks (GNNs) and transformers that estimate latency and resource utilization without running synthesis. On the synthetic test set, the models' predictions at the 75th percentile are generally within several percent of the synthesized values for both latency and resources. Key contributions: (1) a large, architecturally diverse benchmark dataset; and (2) GNN- and transformer-based surrogate models that sidestep the time cost of conventional HLS synthesis.

📝 Abstract
As machine learning (ML) is increasingly implemented in hardware to address real-time challenges in scientific applications, the development of advanced toolchains has significantly reduced the time required to iterate on various designs. These advancements have solved major obstacles, but also exposed new challenges. For example, processes that were not previously considered bottlenecks, such as hardware synthesis, are becoming limiting factors in the rapid iteration of designs. To mitigate these emerging constraints, multiple efforts have been undertaken to develop an ML-based surrogate model that estimates resource usage of ML accelerator architectures. We introduce wa-hls4ml, a benchmark for ML accelerator resource and latency estimation, and its corresponding initial dataset of over 680,000 fully connected and convolutional neural networks, all synthesized using hls4ml and targeting Xilinx FPGAs. The benchmark evaluates the performance of resource and latency predictors against several common ML model architectures, primarily originating from scientific domains, as exemplar models, and the average performance across a subset of the dataset. Additionally, we introduce GNN- and transformer-based surrogate models that predict latency and resources for ML accelerators. We present the architecture and performance of the models and find that the models generally predict latency and resources for the 75th percentile within several percent of the synthesized resources on the synthetic test dataset.
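The abstract reports accuracy as an error bound at the 75th percentile. One plausible reading is sketched below: compute the absolute relative error of each prediction against its synthesized value, then take the 75th percentile of those errors. The exact metric definition is not given in this summary, so the formula, the function names, and the sample numbers are all illustrative assumptions, not the authors' evaluation code.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, using linear interpolation."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def relative_errors_percent(predicted, synthesized):
    """Absolute relative error of each prediction, as a percentage of the synthesized value.

    Pairs whose synthesized value is zero are skipped, since the relative
    error is undefined there (an assumption made for this sketch).
    """
    if len(predicted) != len(synthesized):
        raise ValueError("predicted and synthesized must have the same length")
    return [
        abs(p - s) / abs(s) * 100.0
        for p, s in zip(predicted, synthesized)
        if s != 0
    ]


# Made-up numbers for illustration only (e.g. LUT counts); not from the dataset.
synthesized_luts = [1000, 2000, 4000, 8000, 16000]
predicted_luts = [1010, 1960, 4200, 7920, 16800]

errors = relative_errors_percent(predicted_luts, synthesized_luts)
p75_error = percentile(errors, 75)
print(f"75th-percentile relative error: {p75_error:.2f}%")
```

Under this reading, a 75th-percentile error of a few percent means that three quarters of the test designs are predicted to within that margin, while the remaining quarter may be off by more.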
Problem

Research questions and friction points this paper is trying to address.

Hardware synthesis is becoming a bottleneck for ML accelerator design iteration
Need for accurate resource and latency estimation for neural network accelerators
Lack of a benchmark for evaluating ML-based FPGA resource predictors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed ML-based surrogate models for resource and latency estimation
Introduced a benchmark with a dataset of over 680,000 synthesized neural networks
Used GNN- and transformer-based models to predict accelerator latency and resources
Authors

B. Hawks
Fermi National Accelerator Laboratory, USA
Jason Weitz
University of California, San Diego
Dmitri Demler
Physics Undergraduate
Karla Tame-Narvaez
Fermi National Accelerator Laboratory, USA
Dennis Plotnikov
Johns Hopkins University, USA
M. Rahimifar
University of Sherbrooke, Canada
H. Rahali
University of Sherbrooke, Canada
Audrey C. Therrien
University of Sherbrooke, Canada
Donovan Sproule
Columbia University, USA
Elham E Khoda
University of California San Diego, USA
Keegan Smith
Texas A&M University, USA
Russell Marroquin
University of California San Diego, USA
G. D. Guglielmo
Fermi National Accelerator Laboratory, USA
Nhan Tran
Fermi National Accelerator Laboratory, USA
Javier Duarte
University of California San Diego, USA
Vladimir Loncar
CERN