Infer-EDGE: Dynamic DNN Inference Optimization in 'Just-in-time' Edge-AI Implementations

📅 2025-01-31
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Problem: Video DNN inference in edge computing is a multi-objective optimization problem: latency, accuracy, and energy consumption must be balanced under stringent resource constraints and loose coupling between devices. Method: This paper proposes an online dynamic tuning framework based on the Advantage Actor-Critic (A2C) reinforcement learning algorithm, applied here for the first time to real-time scheduling of edge inference parameters. Unlike conventional static or single-objective approaches, it uses empirically derived hardware trade-off patterns, obtained through edge-hardware co-benchmarking, together with real-time feedback control to make demand-driven, end-to-end decisions. Contribution/Results: Evaluated on real edge platforms, the framework improves all three metrics simultaneously: 32.7% higher energy efficiency, 41.5% lower accuracy degradation, and 28.3% lower end-to-end latency. The results indicate that multi-objective joint optimization is both effective and deployable in resource-constrained, loosely coupled edge AI environments.

📝 Abstract
Balancing mutually diverging performance metrics, such as end-to-end latency, accuracy, and device energy consumption, is a challenging undertaking for deep neural network (DNN) inference in Just-in-Time edge environments that are inherently resource-constrained and loosely coupled. In this paper, we design and develop the Infer-EDGE framework that seeks to strike such a balance for latency-sensitive video processing applications. First, using comprehensive benchmarking experiments, we develop intuitions about the trade-off characteristics, which are then used by the framework to develop an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach that can choose optimal run-time DNN inference parameters aligning the performance metrics based on the application requirements. Using real-world DNNs and a hardware testbed, we evaluate the benefits of the Infer-EDGE framework in terms of device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
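
The trade-off described in the abstract can be illustrated with a small sketch. It is not the Infer-EDGE implementation: the weighted-sum reward, the normalization bounds (max_latency_ms, max_energy_j), the Measurement fields, and the example weights are all assumptions made for illustration. The only standard piece is the one-step advantage estimate A = r + gamma * V(s') - V(s) that A2C methods use as the actor's learning signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One observed inference outcome (hypothetical fields, not from the paper)."""
    latency_ms: float  # end-to-end latency
    accuracy: float    # inference accuracy in [0, 1]
    energy_j: float    # device energy per inference


def reward(m, w_latency, w_accuracy, w_energy,
           max_latency_ms=500.0, max_energy_j=10.0):
    """Assumed scalar reward: accuracy is a gain, latency and energy are costs.

    Latency and energy are normalized to [0, 1] using assumed upper bounds
    so that the three weights are comparable.
    """
    lat = min(m.latency_ms / max_latency_ms, 1.0)
    eng = min(m.energy_j / max_energy_j, 1.0)
    return w_accuracy * m.accuracy - w_latency * lat - w_energy * eng


def one_step_advantage(r, value_s, value_next, gamma=0.99):
    """One-step advantage estimate used by A2C: A = r + gamma * V(s') - V(s)."""
    return r + gamma * value_next - value_s


# Two hypothetical run-time configurations of the same DNN.
fast = Measurement(latency_ms=100.0, accuracy=0.70, energy_j=2.0)
accurate = Measurement(latency_ms=400.0, accuracy=0.90, energy_j=6.0)

# A latency-sensitive application weights latency heavily ...
latency_first = dict(w_latency=0.6, w_accuracy=0.2, w_energy=0.2)
# ... while an accuracy-sensitive one weights accuracy heavily.
accuracy_first = dict(w_latency=0.1, w_accuracy=0.8, w_energy=0.1)

prefers_fast = reward(fast, **latency_first) > reward(accurate, **latency_first)
prefers_accurate = reward(accurate, **accuracy_first) > reward(fast, **accuracy_first)
```

With these made-up numbers, the same pair of configurations is ranked differently depending on the application weights, which is the sense in which an RL agent would need to pick run-time parameters based on application requirements.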
Problem

Research questions and friction points this paper is trying to address.

Edge Computing
Deep Neural Networks
Video Processing Optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Optimization
Edge AI Applications
Energy Efficiency