A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

📅 2023-11-29
🏛️ arXiv.org
📈 Citations: 4
✨ Influential: 0
🤖 AI Summary
Designing deep learning accelerators for heterogeneous HPC and edge platforms faces key challenges, including insufficient exploitation of parallelism and excessive data-movement overhead. This paper systematically surveys accelerator design methodologies, covering hardware-software co-design, high-level synthesis, domain-specific compilers (e.g., TVM, Halide), design space exploration, and cycle-accurate modeling and simulation. It proposes, for the first time, a unified multi-dimensional classification framework that distills two fundamental principles: “minimizing data movement” and “maximizing parallelism.” The survey bridges the gap between architectural overviews and implementation-oriented methodologies, explicitly identifying emerging directions such as approximate computing integrated with reconfigurability. The work provides both a methodological foundation and practical guidance for developing efficient, scalable AI accelerators, enabling principled design decisions across diverse heterogeneous computing ecosystems.
📝 Abstract
Given the increasing size and complexity of deep neural networks, their efficient execution has become a pressing concern in the design of heterogeneous High-Performance Computing (HPC) and edge platforms, leading to a wide variety of proposals for specialized deep learning architectures and hardware accelerators. The design of such architectures and accelerators requires a multidisciplinary approach combining expertise from several areas, from machine learning to computer architecture, low-level hardware design, and approximate computing. Several methodologies and tools have been proposed to improve the process of designing accelerators for deep learning, aimed at maximizing parallelism and minimizing data movement to achieve high performance and energy efficiency. This paper critically reviews influential tools and design methodologies for deep learning accelerators, offering a wide perspective on this rapidly evolving field. It complements surveys on architectures and accelerators by covering hardware-software co-design, automated synthesis, domain-specific compilers, design space exploration, modeling, and simulation, providing insights into technical challenges and open research directions.
Problem

Research questions and friction points this paper is trying to address.

Efficient execution of large deep neural networks on heterogeneous platforms
Multidisciplinary design of specialized architectures and hardware accelerators
Optimizing parallelism and data movement for performance and energy efficiency
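The data-movement side of the last point can be made concrete with loop tiling, a standard technique for reducing off-chip traffic in accelerator dataflows. The counting model below is an illustrative sketch (the function and its assumptions are ours, not from the survey):

```python
def matmul_traffic(n, tile=None):
    """Approximate words read from main memory for an n x n matrix multiply.

    Naive loop nest: every element of A and B is re-read on each use,
    so roughly 2 * n**3 words cross the memory interface.
    Tiled loop nest (tile x tile blocks held on chip): each of the
    (n/tile)**3 block products loads one block of A and one of B once,
    cutting traffic to roughly 2 * n**3 / tile words (C traffic ignored).
    """
    if tile is None:
        return 2 * n ** 3
    assert n % tile == 0, "illustrative model: tile must divide n"
    blocks = n // tile
    return blocks ** 3 * 2 * tile ** 2

# With 32 x 32 tiles, off-chip traffic drops by the tile size: 32x.
reduction = matmul_traffic(1024) // matmul_traffic(1024, tile=32)
print(reduction)  # -> 32
```

Under this model the traffic reduction equals the tile size, which is why on-chip buffer capacity is such a central lever in the design methodologies the survey covers.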
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hardware-software co-design for deep learning accelerators
Automated synthesis and domain-specific compilers
Design space exploration with modeling and simulation
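Design space exploration with an analytic model, as listed above, can be sketched in a few lines: sweep candidate configurations, evaluate each with a cost model, and keep the Pareto-optimal points. The `model` function and all its constants below are hypothetical placeholders, not the survey's methodology:

```python
from itertools import product

def model(pes, tile, n=1024, bw=8.0):
    """Toy analytic cost model (all constants are illustrative assumptions):
    compute time scales with 1/pes, memory time with traffic/bandwidth."""
    macs = n ** 3                        # multiply-accumulates in an n x n matmul
    traffic = 2 * n ** 3 / tile          # words moved, assuming tile-level reuse
    t_compute = macs / pes               # cycles, one MAC per PE per cycle
    t_memory = traffic / bw              # cycles at bw words per cycle
    latency = max(t_compute, t_memory)   # assumes perfect compute/memory overlap
    area = pes + tile * tile             # stand-in area cost: PEs + tile buffer
    return latency, area

# Exhaustive sweep over a small design space of (PE count, tile size).
space = list(product([64, 256, 1024], [8, 16, 32]))
evaluated = [(model(p, t), (p, t)) for p, t in space]

# Keep only Pareto-optimal designs: no other design is at least as good
# on both latency and area while having different metrics.
pareto = [(m, cfg) for m, cfg in evaluated
          if not any(m2[0] <= m[0] and m2[1] <= m[1] and m2 != m
                     for m2, _ in evaluated)]
```

Real DSE tools replace the exhaustive sweep with smarter search (heuristics, learned predictors) and the toy model with calibrated or cycle-accurate simulation, but the evaluate-and-filter loop has this shape.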
👥 Authors
Fabrizio Ferrandi — Politecnico di Milano, Italy
S. Curzel — Politecnico di Milano, Italy
Leandro Fiorin — Politecnico di Milano, Italy
Daniele Ielmini — Politecnico di Milano, Italy
Cristina Silvano — Professor of Computer Architecture, Politecnico di Milano, IEEE Fellow (Computer Architecture, Design Automation, Design Space Exploration, Energy-Aware Computing)
Francesco Conti — Associate Professor, University of Bologna (Hardware Accelerators, Deep Learning, Ultra-Low Power Computing)
Alessio Burrello — Politecnico di Torino, University of Bologna (Machine Learning, Deep Learning, TinyML, Embedded Programming)
Francesco Barchi — Università di Bologna, Italy
Luca Benini — ETH Zürich, Università di Bologna (Integrated Circuits, Computer Architecture, Embedded Systems, VLSI, Machine Learning)
Luciano Lavagno — Politecnico di Torino, Italy
Teodoro Urso — Politecnico di Torino, Italy
E. Calore — Università degli Studi di Ferrara, Italy
S. Schifano — Università degli Studi di Ferrara, Italy
Cristian Zambelli — Università degli Studi di Ferrara, Italy
M. Palesi — Università degli Studi di Catania, Italy
G. Ascia — Università degli Studi di Catania, Italy
Enrico Russo — Università degli Studi di Catania, Italy
N. Petra — Università degli Studi di Napoli Federico II, Italy
D. Caro — Università degli Studi di Napoli Federico II, Italy
G. Meo — Università degli Studi di Napoli Federico II, Italy
V. Cardellini — Università degli Studi di Roma “Tor Vergata”, Italy
Salvatore Filippone — Università degli Studi di Roma “Tor Vergata”, Italy
F. L. Presti — Università degli Studi di Roma “Tor Vergata”, Italy
Francesco Silvestri — Associate Professor, University of Padova (Algorithms and Data Structures, Algorithms for Mobility, Similarity Search, Parallel Computing, Memory Hierarchies)
P. Palazzari — ENEA, Italy
Stefania Perri — Università degli Studi della Calabria, Italy