Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition

📅 2024-03-12
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
Structured sparse accelerators offer limited flexibility: they cannot directly run unstructured sparse DNNs, and a model fine-tuned for one structured sparse format cannot be reused on hardware with a different format. Method: This paper proposes TASD (tensor approximation via structured decomposition), which uses the distributive property of linear algebra to express an arbitrary sparse tensor as a series of structured sparse sub-tensors, so existing models can run on structured sparse hardware without modification. TASD is realized in the TASDER software framework, which searches for layer-wise, high-quality structured decompositions of both weight and activation tensors, enabling acceleration on any system with structured sparse hardware support (e.g., NVIDIA Sparse Tensor Cores, Cambricon). Contribution/Results: Without model changes or fine-tuning, TASD improves energy-delay product by 74% on average (up to 83%) across diverse off-the-shelf dense and sparse DNNs, narrowing the gap between sparse DNN models and structured sparse hardware.

📝 Abstract
Exploiting sparsity in deep neural networks (DNNs) has been a promising area to meet the growing computation need of modern DNNs. However, in practice, sparse DNN acceleration still faces a key challenge. To minimize the overhead of sparse acceleration, hardware designers have proposed structured sparse hardware support recently, which provides limited flexibility and requires extra model fine-tuning. Moreover, any sparse model fine-tuned for certain structured sparse hardware cannot be accelerated by other structured hardware. To bridge the gap between sparse DNN models and hardware, this paper proposes tensor approximation via structured decomposition (TASD), which leverages the distributive property in linear algebra to turn any sparse tensor into a series of structured sparse tensors. Next, we develop a software framework, TASDER, to accelerate DNNs by searching layer-wise, high-quality structured decomposition for both weight and activation tensors so that they can be accelerated by any systems with structured sparse hardware support. Evaluation results show that, by exploiting prior structured sparse hardware baselines, our method can accelerate off-the-shelf dense and sparse DNNs without fine-tuning and improves energy-delay-product by up to 83% and 74% on average.
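The distributive property the abstract refers to is simply that a matrix–vector product distributes over a sum of matrices: if W = W1 + W2, then Wx = W1·x + W2·x, so an unstructured sparse W can be executed as a series of structured sparse products. A minimal NumPy sketch, assuming a 2:4 structured pattern (at most 2 nonzeros per contiguous group of 4) as the hardware target; `prune_2to4` and `tasd_decompose` are illustrative names for this sketch, not the paper's actual code:

```python
import numpy as np

def prune_2to4(w):
    """Keep the 2 largest-magnitude entries in each contiguous group of 4
    along the flattened tensor (the 2:4 structured-sparse pattern)."""
    g = w.reshape(-1, 4)
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g), axis=1)[:, -2:]  # top-2 indices per group
    np.put_along_axis(out, idx, np.take_along_axis(g, idx, axis=1), axis=1)
    return out.reshape(w.shape)

def tasd_decompose(w, terms=2):
    """Greedily express w as a sum of 2:4 structured tensors plus a residual.
    Illustrative sketch of the TASD idea, not the paper's exact algorithm."""
    parts, residual = [], w.copy()
    for _ in range(terms):
        p = prune_2to4(residual)   # structured term capturing the residual
        parts.append(p)
        residual = residual - p    # what the structured terms still miss
    return parts, residual

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8))    # stand-in for an unstructured sparse weight
x = rng.standard_normal(8)

parts, residual = tasd_decompose(w, terms=2)

# Distributivity: the sum of structured products equals (w - residual) @ x,
# so each term can run on 2:4 structured sparse hardware independently.
approx = sum(p @ x for p in parts)
assert np.allclose(approx, (w - residual) @ x)
```

With groups of 4 and two 2:4 terms, the residual here is exactly zero, i.e. the decomposition is lossless; in general, fewer terms trade accuracy for speed.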
Problem

Research questions and friction points this paper is trying to address.

Enabling unstructured sparsity acceleration on structured sparse hardware
Eliminating per-hardware model fine-tuning for cross-hardware compatibility
Improving energy-delay product of DNN execution
Innovation

Methods, ideas, or system contributions that make the work stand out.

TASD: approximating unstructured sparse tensors as a series of structured ones
TASDER software framework searching layer-wise structured decompositions
Accelerates off-the-shelf DNNs without fine-tuning, improving EDP
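The layer-wise search idea can be illustrated with a toy per-layer planner: pick the fewest structured terms whose leftover residual is negligible. This is a hypothetical stand-in for TASDER's search, assuming a 2:4 target pattern and a Frobenius-norm error criterion; `search_terms` and `residual_error` are invented names for this sketch:

```python
import numpy as np

def prune_2to4(w):
    """Keep the top-2 magnitudes in each group of 4 (2:4 structured sparsity)."""
    g = w.reshape(-1, 4)
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g), axis=1)[:, -2:]
    np.put_along_axis(out, idx, np.take_along_axis(g, idx, axis=1), axis=1)
    return out.reshape(w.shape)

def residual_error(w, terms):
    """Relative Frobenius error left after `terms` greedy 2:4 structured terms."""
    r = w.copy()
    for _ in range(terms):
        r = r - prune_2to4(r)
    return np.linalg.norm(r) / np.linalg.norm(w)

def search_terms(layers, tol=1e-6, max_terms=2):
    """Per-layer plan: fewest structured terms whose residual error <= tol.
    Hypothetical illustration of a layer-wise decomposition search."""
    plan = {}
    for name, w in layers.items():
        for k in range(1, max_terms + 1):
            if residual_error(w, k) <= tol:
                plan[name] = k
                break
        else:
            plan[name] = max_terms  # budget exhausted; accept the approximation
    return plan

rng = np.random.default_rng(1)
layers = {
    "fc1": rng.standard_normal((8, 16)),              # dense layer
    "fc2": prune_2to4(rng.standard_normal((8, 16))),  # already 2:4 structured
}
plan = search_terms(layers)  # fc2 needs only one term; fc1 needs two
```

A layer that already fits the hardware's structured pattern costs a single term, while denser layers need more terms, which is where a per-layer search pays off.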