🤖 AI Summary
This work addresses the limitations of existing task-level scheduling in device-edge collaborative inference, which suffers from coarse granularity and a lack of complexity awareness and therefore struggles to balance accuracy, latency, and energy consumption. To overcome these challenges, we propose ENACHI, a novel framework featuring a dual-granularity (task- and packet-level) online scheduling architecture. ENACHI integrates two-layer Lyapunov optimization with a progressive transmission mechanism to adapt dynamically to channel variations and task complexity. It manages the long-term energy-accuracy trade-off through a reference power budget while jointly optimizing dynamic bandwidth allocation, slot-level power control, and DNN partitioning decisions. Experimental results on ImageNet demonstrate that, under strict deadlines, ENACHI improves inference accuracy by 43.12% and reduces energy consumption by 62.13%, while maintaining stable energy efficiency even in multi-user congestion scenarios.
📝 Abstract
Device-edge collaborative inference with Deep Neural Networks (DNNs) faces fundamental trade-offs among accuracy, latency, and energy consumption. Current scheduling approaches exhibit two drawbacks: a granularity mismatch between coarse, task-level decisions and fine-grained, packet-level channel dynamics, and insufficient awareness of per-task complexity. Consequently, scheduling solely at the task level leads to inefficient resource utilization. This paper proposes a novel ENergy-ACcuracy Hierarchical optimization framework for split Inference, named ENACHI, that jointly optimizes task- and packet-level scheduling to maximize accuracy under energy and delay constraints. A two-tier Lyapunov-based framework is developed for ENACHI, and a progressive transmission technique is further integrated to enhance adaptivity. At the task level, an outer drift-plus-penalty loop makes online decisions for DNN partitioning and bandwidth allocation, and establishes a reference power budget to manage the long-term energy-accuracy trade-off. At the packet level, an uncertainty-aware progressive transmission mechanism adaptively manages per-sample task complexity. It is integrated with a nested inner control loop implementing a novel reference-tracking policy, which dynamically adjusts per-slot transmit power to adapt to fluctuating channel conditions. Experiments on the ImageNet dataset demonstrate that ENACHI outperforms state-of-the-art benchmarks under varying deadlines and bandwidths, achieving a 43.12% gain in inference accuracy with a 62.13% reduction in energy consumption under stringent deadlines, and exhibits high scalability by maintaining stable energy consumption in congested multi-user scenarios.
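To make the drift-plus-penalty idea concrete, the following is a minimal sketch of a per-slot power decision that trades a throughput reward against a virtual energy queue tracking a reference power budget. All names here (`drift_plus_penalty_power`, `p_ref`, `queue_Q`, the grid search, the Shannon-rate reward) are illustrative assumptions for a generic Lyapunov controller, not the paper's actual notation or algorithm.

```python
import math

def drift_plus_penalty_power(queue_Q, gain_h, V, p_max, bandwidth, noise, p_ref,
                             n_steps=200):
    """Choose per-slot transmit power minimizing a drift-plus-penalty objective.

    Penalty: -V * rate (we reward throughput); drift term: Q * (p - p_ref),
    where Q is a virtual queue penalizing deviation from the reference power
    budget. A simple grid search stands in for the one-dimensional solver.
    """
    best_p, best_obj = 0.0, float("inf")
    for i in range(n_steps + 1):
        p = p_max * i / n_steps
        # Shannon-rate proxy for packets delivered this slot (assumption).
        rate = bandwidth * math.log2(1.0 + gain_h * p / noise)
        obj = -V * rate + queue_Q * (p - p_ref)
        if obj < best_obj:
            best_obj, best_p = obj, p
    return best_p

def update_queue(queue_Q, p_used, p_ref):
    """Virtual energy queue: grows when slot power exceeds the reference budget,
    drains otherwise; keeping it stable enforces the long-term power constraint."""
    return max(queue_Q + p_used - p_ref, 0.0)
```

With an empty queue the controller spends up to `p_max` for throughput; as the queue grows (sustained overspending relative to `p_ref`), the chosen power is pushed down, which is the reference-tracking behavior the abstract describes at the slot level.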