🤖 AI Summary
In dynamic heterogeneous networks, latency-critical applications—such as industrial automation, autonomous driving, and augmented reality—demand strict end-to-end peak latency guarantees, yet existing approaches predominantly optimize average latency and fail to ensure deterministic latency bounds.
Method: This paper proposes a centralized-routing and distributed-scheduling co-design framework based on multi-agent deep reinforcement learning (MARL). It integrates centralized path selection with distributed time-slot allocation, employing an enhanced MADDPG algorithm augmented with rule-based policies informed by networking domain knowledge to balance performance and training efficiency.
Results: Experiments demonstrate that the framework significantly outperforms conventional stochastic optimization-based methods in both on-time delivery ratio and throughput, achieving reliable, high-throughput transmission under stringent peak latency constraints.
📝 Abstract
Timely delivery of delay-sensitive information over dynamic, heterogeneous networks is increasingly essential for a range of interactive applications, such as industrial automation, self-driving vehicles, and augmented reality. However, most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees. This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by leveraging recent advancements in Multi-Agent Deep Reinforcement Learning (MA-DRL). After introducing the Delay-Constrained Maximum-Throughput (DCMT) dynamic network control problem and highlighting the limitations of current solutions, we present a novel MA-DRL network control framework built on a centralized routing and distributed scheduling architecture. The framework incorporates critical networking domain knowledge into the design of effective MA-DRL strategies based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) technique, where centralized routing and distributed scheduling agents dynamically assign paths and schedule packet transmissions according to packet lifetimes, thereby maximizing on-time packet delivery. The generality of the proposed framework allows integrating both data-driven Deep Reinforcement Learning (DRL) agents and traditional rule-based policies in order to strike the right balance between performance and learning complexity. Our results confirm the superiority of the proposed framework over traditional stochastic optimization-based approaches and provide key insights into the role of, and interplay between, data-driven DRL agents and new rule-based policies for efficient, high-performance control of latency-critical services.
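To make the idea of a lifetime-aware rule-based scheduling policy concrete, here is a minimal sketch of one plausible instance: an earliest-deadline-first rule that a distributed scheduling agent could apply at each time slot. This is an illustration only, not the paper's exact formulation; the `Packet` fields, the `edf_schedule` helper, and the feasibility test `lifetime >= hops_left` are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    pkt_id: int
    lifetime: int   # remaining time slots before the E2E deadline expires
    hops_left: int  # remaining hops to the destination


def edf_schedule(queue: List[Packet]) -> Optional[Packet]:
    """Pick the packet to transmit in the current time slot.

    Rule-based policy: among packets that can still meet their
    deadline (lifetime >= hops_left, i.e. at least one slot per
    remaining hop), choose the one with the smallest remaining
    lifetime (earliest deadline first). Packets that can no longer
    arrive on time are skipped, since transmitting them consumes
    capacity without contributing to on-time delivery.
    """
    feasible = [p for p in queue if p.lifetime >= p.hops_left]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.lifetime)
```

For example, given a queue with one already-infeasible packet and two feasible ones, the policy selects the feasible packet closest to its deadline; in the framework described above, such hand-crafted rules can stand in for (or be mixed with) learned DRL scheduling agents to reduce training complexity.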