🤖 AI Summary
This work addresses real-time resource orchestration in heterogeneous edge computing, where multidimensional resource heterogeneity, diverse task requirements, and dynamic network conditions complicate efficient scheduling. The authors propose HiRL, a hierarchical reinforcement learning framework that decouples the mixed continuous-discrete optimization problem into two coordinated levels: continuous power control via Twin Delayed Deep Deterministic Policy Gradient (TD3) and discrete task assignment via Double Deep Q-Network (DDQN). The two levels are coordinated through a five-dimensional queue state representation. The approach combines continuous and discrete reinforcement learning with deadline-aware priority scheduling, failure-penalized adaptive sampling, and a compatibility-aware evaluation mechanism for heterogeneous resources. Experiments show that HiRL reduces latency by 28% relative to Single-DDQN while maintaining near-100% task completion rates. Compared to baseline algorithms, it cuts energy consumption by up to 51% under low load, and it achieves 24% better latency than static optimization approaches under high load, improving the trade-off between energy efficiency and latency.
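To make the two learners named above more concrete, the following is a minimal sketch of the standard bootstrap targets used by Double DQN and TD3 in general. It is not taken from the paper: the function names, the scalar interfaces, and the default `gamma` are assumptions chosen for illustration, and HiRL's actual networks, rewards, and state encoding are not reproduced here.

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target for a discrete action space.

    The online network's Q-values select the next action; the target
    network's Q-values evaluate it. Both are plain lists indexed by action.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def smoothed_action(action, noise, noise_clip, low, high):
    """TD3 target policy smoothing: add clipped noise, then clip to bounds."""
    clipped_noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, action + clipped_noise))


def td3_target(reward, next_q1, next_q2, gamma=0.99, done=False):
    """TD3 target for a continuous action space.

    next_q1 and next_q2 are the two target critics' estimates at the
    (smoothed) next action; taking the minimum limits overestimation.
    """
    if done:
        return reward
    return reward + gamma * min(next_q1, next_q2)
```

In a hierarchical setup like the one described, a target of the first kind would apply to a discrete choice such as which node receives a task, and the second to a continuous quantity such as a power level; how HiRL wires the two together is specified in the paper, not in this sketch.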
📝 Abstract
Edge computing faces unprecedented resource orchestration challenges arising from multi-dimensional heterogeneity across device architectures, diverse task requirements (CPU-intensive, GPU-intensive, and I/O-intensive workloads), and dynamic network conditions. Edge environments demand real-time task processing within strict energy budgets, yet conventional approaches struggle with mixed continuous-discrete optimization while meeting deadline and energy constraints. This paper presents HiRL, a hierarchical reinforcement learning framework that decomposes complex resource orchestration into coordinated power control and task allocation decisions. Our approach separates continuous power management, handled by Twin Delayed Deep Deterministic Policy Gradient (TD3), from discrete task placement, handled by Double Deep Q-Network (DDQN), and unifies the two through a coordination engine with a five-dimensional queue state representation. We propose a heterogeneous resource compatibility assessment with deadline-oriented prioritization and failure-penalized adaptive sampling to enhance decision quality under resource constraints. To improve practical applicability, the framework models comprehensive system dynamics, including device mobility, queue congestion patterns, infrastructure heterogeneity, and priority-sensitive scheduling demands. Experimental results show that HiRL achieves effective latency-energy trade-offs, with a 28% latency reduction compared to Single-DDQN, and maintains nearly 100% task completion rates under all load conditions. Compared to baseline algorithms, HiRL reduces energy consumption by up to 51% under low load while achieving 24% better latency performance than static optimization approaches under high load, establishing effective resource orchestration in heterogeneous edge environments.
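The abstract names deadline-oriented prioritization and failure-penalized adaptive sampling without giving formulas, so the sketch below only illustrates one plausible reading of each idea. Everything in it is hypothetical: the slack-based ordering rule, the `penalty` parameter, the task tuple layout, and all function names are assumptions for illustration, not the authors' definitions.

```python
import heapq
import random


def order_by_urgency(tasks, now=0.0):
    """Order tasks by slack (deadline - now - estimated run time), smallest first.

    tasks: iterable of (name, deadline, est_time) tuples (hypothetical layout).
    """
    heap = [(deadline - now - est_time, name) for name, deadline, est_time in tasks]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


def sampling_weights(failed_flags, penalty=2.0):
    """Give transitions from failed tasks a larger replay-sampling probability.

    Each transition gets base weight 1.0, plus `penalty` if its task failed;
    the weights are then normalized to sum to 1.
    """
    raw = [1.0 + penalty if failed else 1.0 for failed in failed_flags]
    total = sum(raw)
    return [w / total for w in raw]


def sample_batch(transitions, failed_flags, k, penalty=2.0, rng=random):
    """Draw k transitions with replacement, biased toward failures."""
    weights = sampling_weights(failed_flags, penalty)
    return rng.choices(transitions, weights=weights, k=k)


if __name__ == "__main__":
    tasks = [("render", 10.0, 2.0), ("sensor", 3.0, 1.0), ("backup", 50.0, 5.0)]
    print(order_by_urgency(tasks))
    print(sample_batch(["t0", "t1", "t2"], [False, True, False], k=5))
```

Ordering by slack rather than by raw deadline accounts for how long a task is expected to run, and oversampling failed transitions is one simple way to make a learner revisit the decisions that led to missed deadlines; the paper may define both mechanisms differently.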