VUSA: Virtually Upscaled Systolic Array Architecture to Exploit Unstructured Sparsity in AI Acceleration

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of efficiently exploiting unstructured sparsity in deep neural networks (DNNs) for edge AI, this paper proposes a virtually scalable systolic array architecture. The architecture introduces a novel virtual expansion mechanism that dynamically adapts to diverse sparsity patterns without physical reconfiguration, enabling adaptive computation granularity scaling—from dense to highly sparse DNN workloads—within a fixed hardware footprint. Integrated with sparsity-aware dataflow scheduling, virtual mapping and reconfiguration control, and a custom 16nm reusable multiply-accumulate (MAC) unit, the design achieves significant hardware efficiency gains. Evaluated against conventional systolic arrays delivering identical peak throughput, the proposed architecture reduces silicon area by 37% and improves energy efficiency by 68%. These results demonstrate a compelling trade-off among generality, energy efficiency, and hardware utilization for edge AI accelerators.

📝 Abstract
Leveraging high degrees of unstructured sparsity is a promising approach to enhance the efficiency of deep neural network (DNN) accelerators - particularly important for emerging Edge-AI applications. We introduce VUSA, a systolic-array architecture that virtually grows based on the present sparsity to perform larger matrix multiplications with the same number of physical multiply-accumulate (MAC) units. The proposed architecture achieves savings of 37% and 68% in area and power, respectively, at the same peak performance, compared to a baseline systolic-array architecture in a commercial 16-nm technology. Still, the proposed architecture supports acceleration for any DNN with any sparsity - even no sparsity at all. Thus, the proposed architecture is application-independent, making it viable for general-purpose AI acceleration.
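The paper does not spell out the virtual-expansion mechanism at this level of detail, but the core idea - a fixed pool of physical MAC units skipping zero-valued operands so that a sparser matrix behaves like a smaller dense one - can be illustrated with a toy cycle-count model. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def mac_cycles(weights, num_macs):
    """Toy model: cycles for a matrix multiply on a pool of `num_macs`
    physical MAC units. A sparsity-aware array performs only the nonzero
    multiplications, so higher unstructured sparsity lets the same
    hardware 'virtually' handle a larger workload per cycle."""
    nonzero_ops = np.count_nonzero(weights)       # useful multiplications
    total_ops = weights.size                      # what a dense array computes
    cycles_sparse = -(-nonzero_ops // num_macs)   # ceil division
    cycles_dense = -(-total_ops // num_macs)
    return cycles_sparse, cycles_dense

rng = np.random.default_rng(0)
w = rng.random((64, 64))
w[rng.random((64, 64)) < 0.75] = 0.0  # ~75% unstructured sparsity

sparse_c, dense_c = mac_cycles(w, num_macs=256)
print(sparse_c, dense_c)  # sparse case needs far fewer cycles on the same MACs
```

The same effect can be read the other way around, as in the abstract: at a fixed cycle budget, the sparsity-aware array completes a larger matrix multiplication than its physical MAC count would allow for dense data.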
Problem

Research questions and friction points this paper is trying to address.

Exploiting unstructured sparsity in AI accelerators
Enhancing efficiency of DNN accelerators for Edge-AI
Achieving area and power savings in systolic arrays
Innovation

Methods, ideas, or system contributions that make the work stand out.

Virtually upscaled systolic array for sparsity
Same MAC units for larger multiplications
37% area and 68% power savings
Shereef Helal
NXP Semiconductors, Munich, Germany
Alberto García-Ortiz
University of Bremen, Bremen, Germany
Lennart Bamberg
Senior Principal Architect @ NXP
computer architectures · AI/ML hardware · low-power design · interconnect architectures