Efficient Deep Learning Infrastructures for Embedded Computing Systems: A Comprehensive Survey and Future Envision

📅 2024-10-24
🏛️ ACM Transactions on Embedded Computing Systems
📈 Citations: 12
Influential: 0
🤖 AI Summary
A significant computational gap impedes the deployment of artificial intelligence on embedded systems. Method: This paper presents a systematic survey of efficient deep learning infrastructure for embedded AI, covering network design, model compression, on-device learning, lightweight large language models, and software-hardware co-optimization. It introduces a novel seven-dimensional unified analytical framework that integrates training-inference, algorithm-application, and software-hardware dimensions across the full lifecycle. Contribution/Results: The survey identifies emerging directions—including integrated sensing, communication, and intelligence (ISCI), and neural-symbolic collaboration—and synthesizes over 100 representative works, such as neural architecture search (NAS), pruning/quantization/knowledge distillation, incremental learning, MoE-based model light-weighting, TinyML compilers, in-memory computing architectures, and RISC-V-based AI accelerators. It establishes a reproducible technology evolution map and delivers the first comprehensive, system-level introductory guide and practical deployment roadmap for embedded AI.

📝 Abstract
Deep neural networks (DNNs) have recently achieved impressive success across a wide range of real-world vision and language processing tasks, spanning from image classification to many other downstream vision tasks, such as object detection, tracking, and segmentation. However, previous well-established DNNs, despite being able to maintain superior accuracy, have also been evolving to be deeper and wider and thus inevitably necessitate prohibitive computational resources for both training and inference. This trend further enlarges the computational gap between computation-intensive DNNs and resource-constrained embedded computing systems, making it challenging to deploy powerful DNNs in real-world embedded computing systems towards ubiquitous embedded intelligence. To alleviate this computational gap and enable ubiquitous embedded intelligence, we focus in this survey on discussing recent efficient deep learning infrastructures for embedded computing systems, spanning from training to inference, from manual to automated, from convolutional neural networks to transformers, from transformers to vision transformers, from vision models to large language models, from software to hardware, and from algorithms to applications. Specifically, we discuss recent efficient deep learning infrastructures for embedded computing systems from the lens of (1) efficient manual network design for embedded computing systems, (2) efficient automated network design for embedded computing systems, (3) efficient network compression for embedded computing systems, (4) efficient on-device learning for embedded computing systems, (5) efficient large language models for embedded computing systems, (6) efficient deep learning software and hardware for embedded computing systems, and (7) efficient intelligent applications for embedded computing systems. We also envision promising future directions and trends, which have the potential to deliver more ubiquitous embedded intelligence. We believe this survey has its merits and can shed light on future research, which can largely help researchers to quickly and smoothly get started in this emerging field.
Problem

Research questions and friction points this paper is trying to address.

Addressing the computational gap between deep neural networks and resource-constrained embedded systems.
Surveying efficient deep learning infrastructures for embedded systems from training to inference.
Exploring methods like network design, compression, and hardware optimization for embedded deployment.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Survey of efficient deep learning infrastructures for embedded systems
Covers manual and automated network design and compression techniques
Includes software, hardware, and application optimizations for embedded AI
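Two of the compression techniques the survey covers, magnitude pruning and uniform quantization, can be illustrated with a minimal plain-Python sketch. This is a generic illustration of the technique families, not code from the paper; the function names and the 8-bit symmetric scheme are our own assumptions.

```python
# Illustrative sketch of two model-compression primitives discussed in the
# survey: magnitude-based weight pruning and symmetric uniform quantization.
# Operates on a flat list of weights; real toolchains apply these per-tensor.

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction (sparsity) of weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_uniform(weights, bits=8):
    """Map floats to signed integers with a single per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [round(w / scale) for w in weights]          # integer codes
    deq = [v * scale for v in q]                     # dequantized values
    return q, deq, scale

w = [0.9, -0.05, 0.4, 0.01, -0.7]
pruned = prune_by_magnitude(w, 0.4)   # the two smallest weights become 0.0
codes, deq, scale = quantize_uniform(w)
```

Pruning removes parameters outright (exploitable by sparse kernels), while quantization keeps all parameters but shrinks each to a low-bit integer; the survey's compression section treats these, together with knowledge distillation, as complementary.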
Xiangzhong Luo (Nanyang Technological University)
Di Liu (Norwegian University of Science and Technology, Norway)
Hao Kong (Nanyang Technological University, Singapore)
Shuo Huai (Nanyang Technological University)
  Edge Computing · Model Optimization · In-Memory Computing
Hui Chen (Nanyang Technological University, Singapore)
Guochu Xiong (Nanyang Technological University, Singapore)
Weichen Liu (College of Computing and Data Science, Nanyang Technological University)
  Embedded Systems · Multiprocessor Systems · Network-on-Chip