A Multi-task Supervised Compression Model for Split Computing

📅 2025-01-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the accuracy degradation and latency surge caused by task-specific model partitioning in multi-task collaborative inference on edge devices, this paper proposes Ladon, a multi-task-head supervised compression model. It jointly learns compact, shared representations in the early network layers, unifying shared compression with task-specific prediction optimization. The method integrates task-specific head design, inter-layer feature distillation, and communication-aware pruning to construct a lightweight multi-task supervised compression architecture. Evaluated on ILSVRC, COCO, and PASCAL VOC, it matches or surpasses state-of-the-art lightweight baselines in accuracy while reducing end-to-end latency by up to 95.4% and mobile energy consumption by up to 88.2%. This work is the first to introduce multi-task supervised compression into edge-based collaborative inference, significantly improving the accuracy–efficiency trade-off under resource constraints.
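The architecture described above can be caricatured in a few lines: a mobile-side encoder emits one compact, quantized representation that is transmitted once, and several task-specific heads on the server side all consume that shared representation. This is a minimal illustrative sketch, not the paper's implementation; all function names and the coarse-quantization "compression" are hypothetical stand-ins.

```python
# Toy sketch of multi-task-head split computing (illustrative only).

def encoder(x):
    """Mobile-side encoder: produce a compact representation to transmit.
    Coarse quantization stands in for learned supervised compression."""
    return [round(v, 1) for v in x]

def classification_head(z):
    """Server-side head 1: predict the index of the strongest feature."""
    return max(range(len(z)), key=lambda i: z[i])

def detection_head(z):
    """Server-side head 2: keep features above a (hypothetical) threshold."""
    return [v for v in z if v > 0.5]

x = [0.12, 0.87, 0.34]     # raw sensor features on the mobile device
z = encoder(x)             # sent once over the bandwidth-limited channel
label = classification_head(z)   # both heads reuse the same payload
boxes = detection_head(z)
```

The point of the sketch is the data flow: because every head shares one compressed representation, the mobile device encodes and transmits only once per input, which is where the latency and energy savings in multi-task scenarios come from.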

📝 Abstract
Split computing ($\neq$ split learning) is a promising approach to deep learning models for resource-constrained edge computing systems, where weak sensor (mobile) devices are wirelessly connected to stronger edge servers through channels with limited communication capacity. State-of-the-art work on split computing presents methods for single tasks such as image classification, object detection, or semantic segmentation. Applying existing methods to multi-task problems degrades model accuracy and/or significantly increases runtime latency. In this study, we propose Ladon, the first multi-task-head supervised compression model for multi-task split computing. Experimental results show that the multi-task supervised compression model either outperformed or rivaled strong lightweight baseline models in terms of predictive performance on the ILSVRC 2012, COCO 2017, and PASCAL VOC 2012 datasets while learning compressed representations at its early layers. Furthermore, our models reduced end-to-end latency (by up to 95.4%) and the energy consumption of mobile devices (by up to 88.2%) in multi-task split computing scenarios.
Problem

Research questions and friction points this paper is trying to address.

Multi-task Processing
Edge Devices
Computational Segmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

split-computing
multitasking
data-compression