AI Summary
Video DNN inference in edge computing faces a multi-objective optimization challenge balancing latency, accuracy, and energy consumption under stringent resource constraints and loose coupling.
Method: This paper proposes an online dynamic tuning framework based on the Advantage Actor-Critic (A2C) reinforcement learning algorithm, applied here for the first time to real-time edge inference parameter scheduling. Unlike conventional static or single-objective approaches, it leverages empirically derived hardware trade-off patterns to enable demand-driven, end-to-end decision-making via edge-hardware co-benchmarking and real-time feedback control.
Contribution/Results: Evaluated on real-edge platforms, the framework achieves simultaneous improvements: 32.7% higher energy efficiency, 41.5% lower accuracy degradation, and 28.3% reduced end-to-end latency. Results demonstrate both the effectiveness and deployability of multi-objective joint optimization in resource-constrained, loosely coupled edge AI environments.
Abstract
Balancing mutually diverging performance metrics, such as end-to-end latency, accuracy, and device energy consumption, is a challenging undertaking for deep neural network (DNN) inference in Just-in-Time edge environments that are inherently resource-constrained and loosely coupled. In this paper, we design and develop the Infer-EDGE framework that seeks to strike such a balance for latency-sensitive video processing applications. First, using comprehensive benchmarking experiments, we develop intuitions about the trade-off characteristics, which are then used by the framework to develop an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach that can choose optimal run-time DNN inference parameters aligning the performance metrics based on the application requirements. Using real-world DNNs and a hardware testbed, we evaluate the benefits of the Infer-EDGE framework in terms of device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
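To make the multi-objective trade-off concrete, the following is a minimal sketch of how a scalar reward over latency, accuracy, and energy could feed a one-step A2C advantage estimate. It is not the Infer-EDGE reward: the abstract does not specify one, so the `Metrics` and `Weights` types, the normalization bounds, and the weight values are all hypothetical. Only the advantage form A = r + γV(s') − V(s) is the standard one-step estimate used with actor-critic methods.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Measurements for one inference step (hypothetical units)."""
    latency_ms: float   # end-to-end latency in milliseconds
    accuracy: float     # inference accuracy in [0, 1]
    energy_j: float     # device energy in joules


@dataclass(frozen=True)
class Weights:
    """Application-specific priorities (assumed, not from the paper)."""
    latency: float
    accuracy: float
    energy: float


def reward(m: Metrics, w: Weights, max_latency_ms: float, max_energy_j: float) -> float:
    """Weighted scalar reward: accuracy is rewarded, latency and energy are penalized.

    Latency and energy are normalized by assumed upper bounds and clipped to 1.
    """
    lat = min(m.latency_ms / max_latency_ms, 1.0)
    en = min(m.energy_j / max_energy_j, 1.0)
    return w.accuracy * m.accuracy - w.latency * lat - w.energy * en


def advantage(r: float, value_s: float, value_next: float, gamma: float = 0.99) -> float:
    """One-step advantage estimate: A = r + gamma * V(s') - V(s)."""
    return r + gamma * value_next - value_s


if __name__ == "__main__":
    m = Metrics(latency_ms=50.0, accuracy=0.8, energy_j=2.0)
    w = Weights(latency=0.5, accuracy=1.0, energy=0.5)
    r = reward(m, w, max_latency_ms=100.0, max_energy_j=4.0)
    print(r, advantage(r, value_s=0.2, value_next=0.0))
```

In a sketch like this, changing the weights is how an application would express its requirements, for example raising `latency` for a strict real-time video task. A positive advantage means the chosen inference parameters did better than the critic expected, which pushes the actor toward that choice.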