AI Summary
This work addresses low-cost, reliable packet delivery for delay-sensitive applications under strict per-packet end-to-end deadlines. The problem is formulated as a constrained Markov decision process (CMDP), and constrained deep reinforcement learning (CDRL) is used to control resource allocation and dynamic routing. Unlike conventional approaches that optimize average delay, the proposed method targets per-packet deadlines directly by keeping timely throughput above a target reliability level. The reported results indicate that the approach delivers packets on time in settings where existing baselines fall short, and that it does so at lower resource allocation cost than throughput-maximizing methods.
Abstract
Next-generation networks aim to provide performance guarantees to real-time interactive services that require timely and cost-efficient packet delivery. In this context, the goal is to reliably deliver packets with strict deadlines imposed by the application while minimizing overall resource allocation cost. A large body of work has leveraged stochastic optimization techniques to design efficient dynamic routing and scheduling solutions under average delay constraints; however, these methods fall short when faced with strict per-packet delay requirements. We formulate the minimum-cost delay-constrained network control problem as a constrained Markov decision process and utilize constrained deep reinforcement learning (CDRL) techniques to effectively minimize total resource allocation cost while maintaining timely throughput above a target reliability level. Results indicate that the proposed CDRL-based solution can ensure timely packet delivery even when existing baselines fall short, and it achieves lower cost compared to other throughput-maximizing methods.
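To make the constrained formulation concrete, the following is a generic sketch of a CMDP with a timely-throughput constraint. The abstract does not give the paper's notation or solution method, so every symbol here is an assumption introduced for illustration: $\pi$ is the control policy, $c(s_t, a_t)$ the per-step resource allocation cost, $d(s_t, a_t)$ the per-step timely throughput (packets delivered within their deadline), $\gamma$ a discount factor, and $\delta$ the target reliability level. The Lagrangian relaxation shown is one common way to handle such a constraint in constrained reinforcement learning and is not necessarily the approach used by the authors.

```latex
% Constrained objective: minimize cost subject to a timely-throughput floor.
\begin{aligned}
\min_{\pi} \quad & J_c(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \\
\text{s.t.} \quad & J_d(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, d(s_t, a_t) \right] \ge \delta
\end{aligned}

% Lagrangian relaxation with multiplier \lambda \ge 0.
\mathcal{L}(\pi, \lambda) = J_c(\pi) + \lambda \bigl( \delta - J_d(\pi) \bigr),
\qquad
\max_{\lambda \ge 0} \; \min_{\pi} \; \mathcal{L}(\pi, \lambda)

% Projected dual ascent with step size \eta > 0:
% \lambda grows while the constraint is violated (J_d(\pi) < \delta).
\lambda \leftarrow \Bigl[ \lambda + \eta \bigl( \delta - J_d(\pi) \bigr) \Bigr]_{+}
```

Under this sketch, the multiplier $\lambda$ acts as an adaptive price on missed deadlines: it rises while timely throughput is below $\delta$, pushing the policy toward reliability, and falls back toward zero once the target is met, letting the policy reduce cost.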