CIS: Composable Instruction Set for Data Streaming Applications

📅 2024-06-28
🤖 AI Summary
Traditional ISAs prioritize computation while neglecting instruction composability and collaboration, limiting scalability and efficiency of hardware accelerators in dataflow applications. This paper proposes a spatiotemporally composable ISA tailored for dataflow execution. Our approach introduces: (1) the first instruction design jointly incorporating spatial and temporal composition; (2) a resource-centric paradigm enabling plug-and-play hardware resource expansion; (3) a lightweight instruction set, dataflow-driven dynamic instruction composition, resource-aware heterogeneous mapping, and multi-level loop auto-unrolling with pipelined scheduling. Experimental evaluation on AI/ML workloads demonstrates near-theoretical-optimal PE utilization, substantially reduced control overhead, and significant end-to-end performance gains over state-of-the-art ISAs.

📝 Abstract
The enhanced efficiency of hardware accelerators, including Single Instruction Multiple Data (SIMD) architectures and Coarse-Grained Reconfigurable Architectures (CGRAs), is driving significant advancements in Artificial Intelligence and Machine Learning (AI/ML) applications. These applications frequently involve data streaming operations composed of numerous vector calculations that are inherently amenable to parallelization. However, despite considerable progress in hardware accelerator design, the potential of these accelerators remains constrained by conventional instruction set architectures (ISAs). Traditional ISAs, primarily designed for microprocessors and accelerators, emphasize computation while often neglecting instruction composability and inter-instruction cooperation. The result is rigid ISAs that are difficult to extend and that incur large control overhead in their hardware implementations. To address this, we present a novel composable instruction set (CIS) architecture, designed with both spatial and temporal composability, which makes it well suited to data streaming applications. The proposed CIS uses a small instruction set, yet efficiently implements the complex, multi-level loop structures essential for accelerating data streaming workloads. Furthermore, CIS adopts a resource-centric approach: it can be extended simply by integrating new hardware resources, which enables custom, heterogeneous computing platforms. A performance comparison between the proposed CIS and other state-of-the-art ISAs shows that a CIS-based architecture significantly outperforms existing solutions and achieves near-optimal processing element (PE) utilization.
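As a rough illustration of how a small instruction set might express a multi-level loop, the sketch below composes two hypothetical "repeat" instructions into a single address stream. The `Rep` type, its `count` and `step` fields, and the `compose` function are assumptions made for this example only; they are not the paper's instruction encoding or semantics.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rep:
    """Hypothetical 'repeat' instruction describing one loop level.

    This is an illustrative stand-in, not an instruction defined by CIS.
    """
    count: int  # number of iterations at this loop level
    step: int   # address increment applied per iteration


def compose(base, levels):
    """Nest the given loop levels (outermost first) into one address stream.

    Each level contributes index * step to the address, so a handful of
    small instructions together describe a multi-level loop.
    """
    addrs = []
    for idx in product(*(range(level.count) for level in levels)):
        addrs.append(base + sum(i * level.step for i, level in zip(idx, levels)))
    return addrs


# Two instructions walk a 2x3 tile of a row-major buffer that is 8 elements wide.
stream = compose(0, [Rep(count=2, step=8), Rep(count=3, step=1)])
print(stream)  # [0, 1, 2, 8, 9, 10]
```

Adding a third `Rep` to the list would add another loop level without changing `compose`, which is the sense in which the levels compose in this toy model.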

Problem

Research questions and friction points this paper is trying to address.

Overcoming rigid ISAs limiting hardware accelerators' potential
Enhancing instruction composability for data streaming applications
Reducing control overhead in parallelizable vector calculations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel composable instruction set (CIS) architecture
Spatial and temporal composability for data streaming
Resource-centric approach for easy extension
Yu Yang
KTH Royal Institute of Technology, Stockholm, Stockholm, Sweden
Jordi Altayó González
KTH Royal Institute of Technology, Stockholm, Stockholm, Sweden
Paul Delestrac
Postdoc Researcher
Design Automation
A. Hemani
KTH Royal Institute of Technology, Stockholm, Stockholm, Sweden