AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations

📅 2025-01-17
🤖 AI Summary
To address the low search efficiency and poor generalization of AI chip design space exploration (DSE) over non-convex, high-dimensional, and irregular design spaces, this paper proposes an end-to-end, constant-time hardware accelerator architecture recommendation method. It introduces a unified classification-regression representation paradigm and a hardware-aware, contrastive-learning-enhanced encoder-decoder Transformer, yielding uniform intermediate representations of the design space and strong cross-model generalization. Evaluated on 10⁵ real-world DNN workloads, the method improves optimal design point identification accuracy by 15% on average. On unseen large language model (LLM) workloads, the identified hardware achieves 1.7× lower inference latency than prior approaches. The framework combines computational efficiency (constant-time inference) with broad architectural adaptability (no retraining), advancing scalable, generalizable DSE for heterogeneous AI accelerators.
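The summary credits the encoder's uniform intermediate representation to contrastive learning. As a generic illustration (the paper's exact hardware-aware formulation is not given here), an InfoNCE-style contrastive objective pulls each design-space embedding toward its matching "positive" view and pushes it away from the others. A minimal pure-Python sketch, with illustrative names:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss over paired embeddings: anchor i should be
    most similar to positives[i] and dissimilar to every other positive.
    This is a generic sketch, not the paper's exact objective."""
    total = 0.0
    for i, a in enumerate(anchors):
        sims = [cosine(a, p) / temperature for p in positives]
        m = max(sims)  # log-sum-exp with max-shift for numerical stability
        log_den = m + math.log(sum(math.exp(s - m) for s in sims))
        total += -(sims[i] - log_den)
    return total / len(anchors)
```

Well-aligned anchor/positive pairs drive the loss toward zero, while mismatched pairs inflate it, which is what encourages workloads with similar optimal designs to cluster in the intermediate representation.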

📝 Abstract
Design space exploration (DSE) plays a crucial role in enabling custom hardware architectures, particularly for emerging applications like AI, where optimized and specialized designs are essential. With the growing complexity of deep neural networks (DNNs) and the introduction of advanced foundational models (FMs), the design space for DNN accelerators is expanding at an exponential rate. Additionally, this space is highly non-uniform and non-convex, making it increasingly difficult to navigate and optimize. Traditional DSE techniques rely on search-based methods, which involve iterative sampling of the design space to find the optimal solution. However, this process is both time-consuming and often fails to converge to the global optimum for such design spaces. Recently, AIrchitect v1, the first attempt to address the limitations of search-based techniques, transformed DSE into a constant-time classification problem using recommendation networks. In this work, we propose AIrchitect v2, a more accurate and generalizable learning-based DSE technique applicable to large-scale design spaces that overcomes the shortcomings of earlier approaches. Specifically, we devise an encoder-decoder transformer model that (a) encodes the complex design space into a uniform intermediate representation using contrastive learning and (b) leverages a novel unified representation blending the advantages of classification and regression to effectively explore the large DSE space without sacrificing accuracy. Experimental results evaluated on 10^5 real DNN workloads demonstrate that, on average, AIrchitect v2 outperforms existing techniques by 15% in identifying optimal design points. Furthermore, to demonstrate the generalizability of our method, we evaluate performance on unseen model workloads (LLMs) and attain a 1.7x improvement in inference latency on the identified hardware architecture.
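The abstract's "unified representation blending the advantages of classification and regression" is commonly realized by classifying a coarse bin of a design parameter and regressing a continuous offset within that bin. The sketch below illustrates that decode step only; the function and parameter names (`decode_design_point`, `bin_edges`) are illustrative assumptions, not from the paper:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def decode_design_point(bin_logits, bin_offsets, bin_edges):
    """Hypothetical decoding of a unified classification-regression
    output: classification picks the most likely coarse bin, and a
    regressed offset in [0, 1] places the design point inside it.
    `bin_edges` has len(bin_logits) + 1 entries delimiting the bins."""
    probs = softmax(bin_logits)
    k = max(range(len(probs)), key=probs.__getitem__)  # classified bin
    lo, hi = bin_edges[k], bin_edges[k + 1]
    offset = min(max(bin_offsets[k], 0.0), 1.0)        # clamp to [0, 1]
    return lo + offset * (hi - lo)                     # continuous value
```

For example, with logits favoring the second bin over edges `[0, 64, 128, 256]` and a regressed offset of 0.25, the decoded value is 64 + 0.25 × 64 = 80. Coarse bins keep the classification head small even for huge design spaces, while the offset recovers the resolution a pure classifier would lose.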
Problem

Research questions and friction points this paper is trying to address.

- DNN Accelerator Design
- Optimization
- Custom Hardware

Innovation

Methods, ideas, or system contributions that make the work stand out.

- AIrchitect v2
- Efficient hardware design
- Performance enhancement