Infer-EDGE: Dynamic DNN Inference Optimization in 'Just-in-time' Edge-AI Implementations

📅 2025-01-31

🤖 AI Summary

Video DNN inference at the edge poses a multi-objective optimization problem: latency, accuracy, and energy consumption must be balanced under stringent resource constraints in loosely coupled environments. Method: This paper proposes an online dynamic tuning framework based on the Advantage Actor-Critic (A2C) reinforcement learning algorithm, reportedly the first application of A2C to real-time scheduling of edge inference parameters. Unlike conventional static or single-objective approaches, it uses hardware trade-off patterns derived from benchmarking on edge hardware to make demand-driven, end-to-end decisions with real-time feedback control. Contribution/Results: Evaluated on a real edge hardware testbed, the framework reports simultaneous improvements: 32.7% higher energy efficiency, 41.5% lower accuracy degradation, and 28.3% lower end-to-end latency. The results indicate that joint multi-objective optimization is both effective and deployable in resource-constrained, loosely coupled edge AI environments.

📝 Abstract

Balancing mutually diverging performance metrics, such as end-to-end latency, accuracy, and device energy consumption, is a challenging undertaking for deep neural network (DNN) inference in Just-in-Time edge environments that are inherently resource-constrained and loosely coupled. In this paper, we design and develop the Infer-EDGE framework that seeks to strike such a balance for latency-sensitive video processing applications. First, using comprehensive benchmarking experiments, we develop intuitions about the trade-off characteristics, which are then used by the framework to develop an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach that can choose optimal run-time DNN inference parameters aligning the performance metrics based on the application requirements. Using real-world DNNs and a hardware testbed, we evaluate the benefits of the Infer-EDGE framework in terms of device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
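To make the trade-off concrete, the following is a minimal Python sketch of how an A2C-style agent might score one choice of run-time inference parameters. It is not the paper's implementation: the `Measurement` fields, the weighted-sum `reward`, the latency and energy budgets, and the example weights are all illustrative assumptions. Only the one-step advantage formula, A(s, a) = r + γV(s') − V(s), is the standard quantity that actor-critic methods use to scale the policy update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Observed outcome of one inference step (hypothetical fields)."""
    latency_ms: float
    accuracy: float  # fraction in [0, 1]
    energy_j: float


def reward(m, weights, latency_budget_ms=100.0, energy_budget_j=5.0):
    """Scalarize the three metrics into one reward; higher is better.

    The weights (w_latency, w_accuracy, w_energy) stand in for the
    application requirements. Budgets are assumed normalizers, and costs
    are clipped at 1.0 so that one metric cannot dominate unboundedly.
    """
    w_lat, w_acc, w_en = weights
    lat_cost = min(m.latency_ms / latency_budget_ms, 1.0)
    en_cost = min(m.energy_j / energy_budget_j, 1.0)
    return w_acc * m.accuracy - w_lat * lat_cost - w_en * en_cost


def one_step_advantage(r, value_s, value_next, gamma=0.99, done=False):
    """One-step advantage estimate: r + gamma * V(s') - V(s).

    No bootstrapping from V(s') happens when the episode has ended.
    """
    bootstrap = 0.0 if done else gamma * value_next
    return r + bootstrap - value_s


# Two hypothetical configurations: a small, fast model and a large, slow one.
fast = Measurement(latency_ms=40.0, accuracy=0.70, energy_j=1.0)
slow = Measurement(latency_ms=90.0, accuracy=0.90, energy_j=4.0)

latency_first = (0.6, 0.2, 0.2)   # latency-sensitive application
accuracy_first = (0.1, 0.8, 0.1)  # accuracy-sensitive application

print(reward(fast, latency_first) > reward(slow, latency_first))
print(reward(slow, accuracy_first) > reward(fast, accuracy_first))
```

With these assumed numbers, the latency-weighted reward prefers the fast configuration and the accuracy-weighted reward prefers the slow one, which is the kind of requirement-dependent choice the abstract describes. In a full agent, the critic would supply `value_s` and `value_next`, and the actor's log-probability gradient would be scaled by the resulting advantage.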

Problem

Research questions and friction points this paper is trying to address.

Edge Computing

Deep Neural Networks

Video Processing Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Optimization

Edge AI Applications

Energy Efficiency
