🤖 AI Summary
This work addresses the limitations of existing task-level scheduling in device-edge collaborative inference, which suffers from coarse granularity and a lack of complexity awareness and therefore struggles to balance accuracy, latency, and energy consumption. To overcome these challenges, we propose ENACHI, a novel framework featuring a dual-granularity (task- and packet-level) online scheduling architecture. ENACHI integrates two-layer Lyapunov optimization with a progressive transmission mechanism to adapt dynamically to channel variations and task complexity. It manages the long-term energy-accuracy trade-off through a reference power budget while jointly optimizing dynamic bandwidth allocation, slot-level power control, and DNN partitioning decisions. Experimental results on ImageNet demonstrate that, under strict deadlines, ENACHI improves inference accuracy by 43.12% and reduces energy consumption by 62.13%, while maintaining stable energy efficiency even in multi-user congestion scenarios.
📝 Abstract
Device-edge collaborative inference with Deep Neural Networks (DNNs) faces fundamental trade-offs among accuracy, latency, and energy consumption. Current scheduling approaches exhibit two drawbacks: a granularity mismatch between coarse, task-level decisions and fine-grained, packet-level channel dynamics, and insufficient awareness of per-task complexity. Consequently, scheduling solely at the task level leads to inefficient resource utilization. This paper proposes a novel ENergy-ACcuracy Hierarchical optimization framework for split Inference, named ENACHI, that jointly optimizes task- and packet-level scheduling to maximize accuracy under energy and delay constraints. A two-tier Lyapunov-based framework is developed for ENACHI, and a progressive transmission technique is further integrated to enhance adaptivity. At the task level, an outer drift-plus-penalty loop makes online decisions for DNN partitioning and bandwidth allocation, and establishes a reference power budget to manage the long-term energy-accuracy trade-off. At the packet level, an uncertainty-aware progressive transmission mechanism adaptively manages per-sample task complexity. It is integrated with a nested inner control loop implementing a novel reference-tracking policy, which dynamically adjusts per-slot transmit power to adapt to fluctuating channel conditions. Experiments on the ImageNet dataset demonstrate that ENACHI outperforms state-of-the-art benchmarks under varying deadlines and bandwidths, achieving a 43.12% gain in inference accuracy with a 62.13% reduction in energy consumption under stringent deadlines, and exhibits high scalability by maintaining stable energy consumption in congested multi-user scenarios.
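To make the drift-plus-penalty idea concrete, the following is a minimal sketch of a per-slot power decision that trades a throughput reward against a virtual energy queue tracking a reference power budget. All names here (`drift_plus_penalty_power`, `p_ref`, `queue_Q`, the grid search, the Shannon-rate reward) are illustrative assumptions for a generic Lyapunov controller, not the paper's actual notation or algorithm.

```python
import math

def drift_plus_penalty_power(queue_Q, gain_h, V, p_max, bandwidth, noise, p_ref,
                             n_steps=200):
    """Choose per-slot transmit power minimizing a drift-plus-penalty objective.

    Penalty: -V * rate (we reward throughput); drift term: Q * (p - p_ref),
    where Q is a virtual queue penalizing deviation from the reference power
    budget. A simple grid search stands in for the one-dimensional solver.
    """
    best_p, best_obj = 0.0, float("inf")
    for i in range(n_steps + 1):
        p = p_max * i / n_steps
        # Shannon-rate proxy for packets delivered this slot (assumption).
        rate = bandwidth * math.log2(1.0 + gain_h * p / noise)
        obj = -V * rate + queue_Q * (p - p_ref)
        if obj < best_obj:
            best_obj, best_p = obj, p
    return best_p

def update_queue(queue_Q, p_used, p_ref):
    """Virtual energy queue: grows when slot power exceeds the reference budget,
    drains otherwise; keeping it stable enforces the long-term power constraint."""
    return max(queue_Q + p_used - p_ref, 0.0)
```

With an empty queue the controller spends up to `p_max` for throughput; as the queue grows (sustained overspending relative to `p_ref`), the chosen power is pushed down, which is the reference-tracking behavior the abstract describes at the slot level.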