🤖 AI Summary
To mitigate bandwidth scarcity while preserving QoS isolation in tactical wireless networks, this paper proposes a two-stage RAN slicing framework driven by deep reinforcement learning (DRL). Stage I performs dynamic inter-slice bandwidth allocation under strict QoS constraints; Stage II performs intra-slice, user-level resource scheduling. Together, the two stages jointly optimize spectral efficiency and both inter-slice and intra-slice isolation. We introduce the first DRL-based RAN slicing architecture that integrates hard QoS constraints across both stages, and present the first systematic O-RAN implementation of end-to-end RAN slicing lifecycle management with scalable deployment of multiple DRL algorithms (PPO, SAC, A2C). Experimental results show significant improvements over baselines: 32.7% higher slice isolation, 41.5% lower average user latency, and 28.3% higher bandwidth utilization, which supports the framework's robustness and adaptability across diverse traffic loads.
📝 Abstract
The next generation of tactical networks (TNs) is poised to further leverage the key enablers of 5G and beyond-5G (B5G) technology, such as radio access network (RAN) slicing and the open RAN (O-RAN) paradigm, to unlock multiple architectural options and opportunities for a wide range of innovative applications. RAN slicing and the O-RAN paradigm are considered game changers in TNs: the former makes it possible to tailor services to users' requirements, and the latter brings openness and intelligence to the management of the RAN. In TNs, bandwidth scarcity requires a dynamic bandwidth slicing strategy. Although this type of strategy ensures efficient bandwidth utilization, it compromises RAN slicing isolation in terms of quality of service (QoS) performance. To deal with this challenge, we propose a deep reinforcement learning (DRL)-based RAN slicing mechanism that achieves a trade-off between efficient RAN bandwidth sharing and appropriate inter- and intra-slice isolation. The proposed mechanism performs bandwidth allocation in two stages. In the first stage, the bandwidth is allocated to the RAN slices. In the second stage, each slice partitions its bandwidth among its associated users. In both stages, the slicing operation is constrained by several considerations related to improving the QoS of slices and users, which in turn foster inter- and intra-slice isolation. The mechanism uses DRL algorithms to perform the bandwidth sharing operation in each stage. We propose to deploy the mechanism in an O-RAN architecture and describe the O-RAN functional blocks and the main DRL model lifecycle management phases involved. We also develop three different implementations of the proposed mechanism, each based on a different DRL algorithm, and evaluate their performance against multiple baselines across various parameters.
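The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the DRL agents are replaced by fixed, hypothetical `weight` values standing in for whatever a trained policy would output, and the QoS constraints are reduced to an assumed per-slice and per-user minimum bandwidth (`min_bw`). All function names, field names, and numbers are illustrative assumptions.

```python
from typing import Dict, List


def allocate(total: float, minimums: List[float], weights: List[float]) -> List[float]:
    """Grant every entity its minimum, then split the spare bandwidth by weight.

    The weights are a placeholder for the action of a DRL agent (assumption).
    """
    if not minimums:
        return []
    reserved = sum(minimums)
    if reserved > total:
        raise ValueError("minimum guarantees exceed available bandwidth")
    spare = total - reserved
    weight_sum = sum(weights)
    if weight_sum <= 0:
        # No preference expressed: share the spare bandwidth evenly.
        shares = [spare / len(minimums)] * len(minimums)
    else:
        shares = [spare * w / weight_sum for w in weights]
    return [m + s for m, s in zip(minimums, shares)]


def two_stage_allocation(total_bw: float, slices: Dict[str, dict]) -> Dict[str, dict]:
    """Stage 1: split total_bw across slices. Stage 2: split each slice across its users."""
    names = list(slices)
    # Stage 1: inter-slice allocation under per-slice minimum guarantees.
    slice_bw = allocate(
        total_bw,
        [slices[n]["min_bw"] for n in names],
        [slices[n]["weight"] for n in names],
    )
    result = {}
    for name, bw in zip(names, slice_bw):
        users = slices[name]["users"]
        # Stage 2: intra-slice allocation under per-user minimum guarantees.
        user_bw = allocate(
            bw,
            [u["min_bw"] for u in users],
            [u["weight"] for u in users],
        )
        result[name] = {"slice_bw": bw, "user_bw": user_bw}
    return result


# Hypothetical scenario: 20 MHz shared by two slices with two users each.
example = {
    "c2": {"min_bw": 4.0, "weight": 1.0,
           "users": [{"min_bw": 1.0, "weight": 1.0}, {"min_bw": 1.0, "weight": 3.0}]},
    "video": {"min_bw": 6.0, "weight": 3.0,
              "users": [{"min_bw": 2.0, "weight": 1.0}, {"min_bw": 2.0, "weight": 1.0}]},
}
plan = two_stage_allocation(20.0, example)
```

Because each entity's minimum is reserved before any spare bandwidth is shared, a heavily loaded slice or user cannot push another below its guarantee. This is the simplest form of the isolation property the abstract refers to; in the proposed mechanism, a learned policy rather than fixed weights would decide how the remaining bandwidth is shared.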