AI Summary
Selecting and optimizing deep learning (DL) accelerators for high-performance computing (HPC) environments has become increasingly critical due to growing demands for AI-HPC convergence.
Method: This paper presents a systematic survey of mainstream and emerging hardware acceleration technologies from 2019 to 2024, covering GPUs/TPUs, FPGAs/ASICs, RISC-V co-processors, in-memory computing (3D-stacked resistive and phase-change memories), neuromorphic processors, chiplet-based packaging, and photonic and quantum accelerators. We propose the first cross-architectural classification framework integrating classical and frontier paradigms to clarify technological evolution trajectories and HPC-specific bottlenecks.
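As an illustration of how such a cross-architectural classification might be encoded for downstream analysis, the sketch below groups the surveyed accelerator families into classical and frontier paradigms. The class names, paradigm labels, and compute-model descriptions are our own illustrative assumptions, not the taxonomy defined in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorClass:
    """One entry in a hypothetical cross-architectural taxonomy (illustrative only)."""
    name: str           # accelerator family
    paradigm: str       # "classical" or "frontier"
    compute_model: str  # dominant execution/computing model


# Illustrative taxonomy covering the families listed in the survey scope.
TAXONOMY = (
    AcceleratorClass("GPU/TPU",             "classical", "dense tensor / systolic arrays"),
    AcceleratorClass("FPGA/ASIC",           "classical", "custom dataflow pipelines"),
    AcceleratorClass("RISC-V co-processor", "classical", "vector/matrix ISA extensions"),
    AcceleratorClass("In-memory computing", "frontier",  "analog ReRAM/PCM crossbars"),
    AcceleratorClass("Neuromorphic",        "frontier",  "event-driven spiking networks"),
    AcceleratorClass("Chiplet / MCM",       "frontier",  "multi-die scalable packaging"),
    AcceleratorClass("Photonic",            "frontier",  "optical matrix-vector products"),
    AcceleratorClass("Quantum",             "frontier",  "variational quantum circuits"),
)


def by_paradigm(paradigm: str) -> list[str]:
    """Return the family names belonging to one evolution stage."""
    return [c.name for c in TAXONOMY if c.paradigm == paradigm]


if __name__ == "__main__":
    print("Classical:", by_paradigm("classical"))
    print("Frontier: ", by_paradigm("frontier"))
```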
Contribution/Results: Based on analysis of over 100 representative works, we construct a high-impact DL accelerator technology landscape, deliver scalable design guidelines for heterogeneous HPC platforms, and introduce quantitative benchmarking criteria for accelerator selection, thereby enabling principled, performance-aware integration of AI and HPC.
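To make the idea of quantitative, benchmark-driven accelerator selection concrete, here is a minimal sketch of a weighted-score ranking over normalized metrics. The metric names, weights, and per-device values are placeholder assumptions for illustration; they are not the benchmarking criteria or figures proposed in the paper.

```python
"""Hypothetical weighted-score ranking for DL accelerator selection (illustrative only)."""

# Candidate accelerators with metrics normalized to [0, 1] (higher is better).
# All values below are made-up placeholders, not measured results.
CANDIDATES = {
    "GPU":       {"throughput": 0.90, "energy_eff": 0.55, "mem_bw": 0.85, "maturity": 1.00},
    "TPU":       {"throughput": 0.95, "energy_eff": 0.70, "mem_bw": 0.80, "maturity": 0.90},
    "FPGA":      {"throughput": 0.50, "energy_eff": 0.75, "mem_bw": 0.45, "maturity": 0.80},
    "ReRAM-PIM": {"throughput": 0.60, "energy_eff": 0.95, "mem_bw": 0.90, "maturity": 0.30},
}

# Application-dependent weights: an HPC training workload might stress throughput
# and memory bandwidth, while an energy-constrained deployment would reweight.
WEIGHTS = {"throughput": 0.4, "energy_eff": 0.2, "mem_bw": 0.3, "maturity": 0.1}


def score(metrics: dict) -> float:
    """Weighted sum of the normalized metrics for one candidate."""
    return sum(WEIGHTS[k] * metrics[k] for k in WEIGHTS)


if __name__ == "__main__":
    # Rank candidates from highest to lowest composite score.
    for name, m in sorted(CANDIDATES.items(), key=lambda kv: score(kv[1]), reverse=True):
        print(f"{name:10s} score = {score(m):.2f}")
```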
Abstract
Recent trends in deep learning (DL) have established hardware accelerators as the most viable solution for several classes of high-performance computing (HPC) applications, such as image classification, computer vision, and speech recognition. This survey summarizes and classifies the most recent advances in designing DL accelerators capable of meeting the performance requirements of HPC applications. In particular, it highlights the most advanced approaches to accelerating deep learning, including not only GPU- and TPU-based accelerators but also design-specific hardware accelerators such as FPGA-based and ASIC-based accelerators, Neural Processing Units, and open-hardware RISC-V-based accelerators and co-processors. The survey also describes accelerators based on emerging memory technologies and computing paradigms, such as 3D-stacked Processor-In-Memory, non-volatile memories (mainly Resistive RAM and Phase-Change Memories) for in-memory computing, Neuromorphic Processing Units, and accelerators based on Multi-Chip Modules. Among emerging technologies, we also offer insights into quantum-based and photonic accelerators. To conclude, the survey classifies the most influential architectures and technologies proposed in recent years, with the aim of offering the reader a comprehensive perspective on the rapidly evolving field of deep learning.