Intra-DP: A High Performance Collaborative Inference System for Mobile Edge Computing

📅 2025-07-08
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
In mobile edge computing, DNN inference faces severe transmission bottlenecks and resource constraints due to conventional layer-wise partitioning and sequential execution. To address this, we propose an operator-level collaborative inference system. Our method breaks inter-layer dependencies by decomposing models into local operators and enabling fine-grained parallel scheduling, thereby deeply overlapping subtask computation with cross-device communication. Crucially, it co-designs the inference strategy with intrinsic model structural characteristics to optimize end-edge collaborative execution. Experimental evaluation demonstrates that, compared to state-of-the-art approaches, our system reduces single-inference latency by up to 50% and energy consumption by up to 75%, while strictly preserving the original model's accuracy.

πŸ“ Abstract
Deploying deep neural networks (DNNs) on resource-constrained mobile devices presents significant challenges, particularly in achieving real-time performance while coping with limited computational resources and battery life. While Mobile Edge Computing (MEC) offers collaborative inference with GPU servers as a promising solution, existing approaches primarily rely on layer-wise model partitioning and suffer from significant transmission bottlenecks caused by the sequential execution of DNN operations. To address this challenge, we present Intra-DP, a high-performance collaborative inference system optimized for DNN inference on MEC. Intra-DP employs a novel parallel computing technique based on local operators (i.e., operators whose minimum unit input is not the entire input tensor, such as the convolution kernel). By decomposing their computations (operations) into several independent sub-operations and overlapping the computation and transmission of different sub-operations through parallel execution, Intra-DP mitigates transmission bottlenecks in MEC, achieving fast and energy-efficient inference. The evaluation demonstrates that Intra-DP reduces per-inference latency by up to 50% and energy consumption by up to 75% compared to state-of-the-art baselines, without sacrificing accuracy.
Problem

Research questions and friction points this paper is trying to address.

Achieving real-time DNN inference on resource-constrained mobile devices
Overcoming transmission bottlenecks in Mobile Edge Computing (MEC)
Reducing latency and energy consumption without sacrificing accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parallel computing with local operators
Decomposing computations into independent sub-operations
Overlapping computation and transmission processes
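The local-operator idea above can be illustrated with a minimal sketch. This is not Intra-DP's actual implementation (the function names and splitting scheme here are hypothetical, for illustration only): a 1-D convolution is a local operator because each output depends only on a small input window, so the input can be split into independent sub-operations whose computation and transmission can overlap.

```python
# Minimal sketch of operator-level decomposition (illustrative, not Intra-DP's code).
from concurrent.futures import ThreadPoolExecutor

def conv1d(x, kernel):
    """Sequential 1-D convolution (valid padding) as the reference result."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def split_with_halo(x, kernel_size, parts):
    """Split x into chunks that each carry the (kernel_size - 1) halo
    of extra inputs needed to compute their outputs independently."""
    n_out = len(x) - kernel_size + 1
    step = (n_out + parts - 1) // parts
    chunks = []
    for start in range(0, n_out, step):
        stop = min(start + step, n_out)
        # Inputs needed for outputs [start, stop).
        chunks.append(x[start:stop + kernel_size - 1])
    return chunks

def parallel_conv1d(x, kernel, parts=4):
    chunks = split_with_halo(x, len(kernel), parts)
    # Each sub-operation runs in its own worker; in a real MEC setting,
    # transmitting one chunk to the edge server would overlap with
    # computing another chunk locally.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        results = pool.map(lambda c: conv1d(c, kernel), chunks)
    return [y for chunk_out in results for y in chunk_out]

# The decomposed result matches the sequential one exactly,
# which is why accuracy is preserved.
x = list(range(20))
kernel = [1, 0, -1]
assert parallel_conv1d(x, kernel) == conv1d(x, kernel)
```

The halo overlap is what makes the sub-operations independent: each chunk carries the few extra inputs its boundary outputs need, so no chunk waits on another's result.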
Zekai Sun
Department of Computer Science, University of Hong Kong, Pok Fu Lam, Hong Kong SAR, China; Shanghai AI Laboratory, Shanghai, China
Xiuxian Guan
Department of Computer Science, University of Hong Kong, Pok Fu Lam, Hong Kong SAR, China
Zheng Lin
Institute of Space Internet, Fudan University, Shanghai 200438, China; School of Computer Science, Fudan University, Shanghai 200438, China
Zihan Fang
Institute of Space Internet, Fudan University, Shanghai 200438, China; School of Computer Science, Fudan University, Shanghai 200438, China
Xiangming Cai
School of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
Zhe Chen
Institute of Space Internet, Fudan University, Shanghai 200438, China; School of Computer Science, Fudan University, Shanghai 200438, China
Fangming Liu
Professor, School of Computer Science & Technology, Huazhong University of Science & Technology
AI & Cloud Computing, Datacenter, LLM System, Edge Computing, Green Computing
Heming Cui
University of Hong Kong
Operating Systems, Programming Language, Distributed Systems, Security
Jie Xiong
School of Computing and Data Science, Nanyang Technological University, Singapore 639798
Wei Ni
FIEEE, AAIA Fellow, Senior Principal Scientist & Conjoint Professor, CSIRO/UNSW
6G Security and Privacy, Connected and Trusted Intelligence, Applied AI/ML
Chau Yuen
IEEE Fellow, Highly Cited Researcher, Nanyang Technological University
Wireless, Smart Grid, Localization, IoT, Big Data