AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

📅 2025-01-17
🤖 AI Summary
To address the low search efficiency and poor generalization of AI chip design space exploration (DSE) over non-convex, high-dimensional, and irregular design spaces, this paper proposes an end-to-end, constant-time hardware accelerator architecture recommendation method. It introduces a unified classification-regression representation paradigm and a hardware-aware, contrastive-learning-enhanced encoder-decoder Transformer, yielding uniform intermediate representations of the design space and strong cross-model generalization. Evaluated on 10⁵ real-world DNN workloads, the method improves optimal design point identification accuracy by 15% on average. On unseen large language model (LLM) workloads, the identified hardware achieves 1.7× lower inference latency than prior approaches. The framework combines computational efficiency (constant-time inference) with broad architectural adaptability (no retraining), advancing scalable, generalizable DSE for heterogeneous AI accelerators.
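The summary credits the encoder's uniform intermediate representation to contrastive learning. As a generic illustration (the paper's exact hardware-aware formulation is not given here), an InfoNCE-style contrastive objective pulls each design-space embedding toward its matching "positive" view and pushes it away from the others. A minimal pure-Python sketch, with illustrative names:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss over paired embeddings: anchor i should be
    most similar to positives[i] and dissimilar to every other positive.
    This is a generic sketch, not the paper's exact objective."""
    total = 0.0
    for i, a in enumerate(anchors):
        sims = [cosine(a, p) / temperature for p in positives]
        m = max(sims)  # log-sum-exp with max-shift for numerical stability
        log_den = m + math.log(sum(math.exp(s - m) for s in sims))
        total += -(sims[i] - log_den)
    return total / len(anchors)
```

Well-aligned anchor/positive pairs drive the loss toward zero, while mismatched pairs inflate it, which is what encourages workloads with similar optimal designs to cluster in the intermediate representation.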

📝 Abstract
Design space exploration (DSE) plays a crucial role in enabling custom hardware architectures, particularly for emerging applications like AI, where optimized and specialized designs are essential. With the growing complexity of deep neural networks (DNNs) and the introduction of advanced foundational models (FMs), the design space for DNN accelerators is expanding at an exponential rate. Additionally, this space is highly non-uniform and non-convex, making it increasingly difficult to navigate and optimize. Traditional DSE techniques rely on search-based methods, which involve iterative sampling of the design space to find the optimal solution. However, this process is both time-consuming and often fails to converge to the global optimum for such design spaces. Recently, AIrchitect v1, the first attempt to address the limitations of search-based techniques, transformed DSE into a constant-time classification problem using recommendation networks. In this work, we propose AIrchitect v2, a more accurate and generalizable learning-based DSE technique applicable to large-scale design spaces that overcomes the shortcomings of earlier approaches. Specifically, we devise an encoder-decoder transformer model that (a) encodes the complex design space into a uniform intermediate representation using contrastive learning and (b) leverages a novel unified representation blending the advantages of classification and regression to effectively explore the large DSE space without sacrificing accuracy. Experimental results evaluated on 10^5 real DNN workloads demonstrate that, on average, AIrchitect v2 outperforms existing techniques by 15% in identifying optimal design points. Furthermore, to demonstrate the generalizability of our method, we evaluate performance on unseen model workloads (LLMs) and attain a 1.7x improvement in inference latency on the identified hardware architecture.
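The abstract's "unified representation blending the advantages of classification and regression" is commonly realized by classifying a coarse bin of a design parameter and regressing a continuous offset within that bin. The sketch below illustrates that decode step only; the function and parameter names (`decode_design_point`, `bin_edges`) are illustrative assumptions, not from the paper:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def decode_design_point(bin_logits, bin_offsets, bin_edges):
    """Hypothetical decoding of a unified classification-regression
    output: classification picks the most likely coarse bin, and a
    regressed offset in [0, 1] places the design point inside it.
    `bin_edges` has len(bin_logits) + 1 entries delimiting the bins."""
    probs = softmax(bin_logits)
    k = max(range(len(probs)), key=probs.__getitem__)  # classified bin
    lo, hi = bin_edges[k], bin_edges[k + 1]
    offset = min(max(bin_offsets[k], 0.0), 1.0)        # clamp to [0, 1]
    return lo + offset * (hi - lo)                     # continuous value
```

For example, with logits favoring the second bin over edges `[0, 64, 128, 256]` and a regressed offset of 0.25, the decoded value is 64 + 0.25 × 64 = 80. Coarse bins keep the classification head small even for huge design spaces, while the offset recovers the resolution a pure classifier would lose.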
Problem

Research questions and friction points this paper is trying to address.

- DNN Accelerator Design
- Optimization
- Custom Hardware

Innovation

Methods, ideas, or system contributions that make the work stand out.

- AIrchitect v2
- Efficient hardware design
- Performance enhancement