SafeSlice: Enabling SLA-Compliant O-RAN Slicing via Safe Deep Reinforcement Learning

📅 2025-03-17
🤖 AI Summary
To address the challenge of guaranteeing latency SLAs in O-RAN network slicing, this paper proposes SafeSlice, a safety-aware deep reinforcement learning framework. SafeSlice introduces (1) a sigmoid-based risk-sensitive reward function that encodes trajectory-level (cumulative) latency constraints, and (2) a safety layer, built on a supervised learning cost model, that projects the agent's resource allocation actions onto the nearest safe actions satisfying state-level (instantaneous) latency constraints. Together these enforce latency compliance at both levels. In experiments that include real VR gaming traffic, SafeSlice achieves reductions of up to 83.23% in average cumulative latency, 93.24% in instantaneous latency violations, and 22.13% in resource consumption compared to the baselines, and remains robust when the latency thresholds are reconfigured.
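
The summary does not give the exact form of the reward, so the following is only a minimal sketch of what a sigmoid-based latency reward could look like. The function names, the `steepness` parameter, the normalization by the threshold, and the weighted average across slices are all assumptions made for illustration, not the paper's definition.

```python
import math


def sigmoid_latency_reward(latency_ms, threshold_ms, steepness=10.0):
    """Hypothetical reward in (0, 1): close to 1 when latency is well below
    the slice threshold, exactly 0.5 at the threshold, close to 0 above it."""
    # Normalize the margin by the threshold so `steepness` is unit-free.
    margin = (latency_ms - threshold_ms) / threshold_ms
    # Clamp the exponent so math.exp cannot overflow on extreme latencies.
    z = max(-60.0, min(60.0, steepness * margin))
    return 1.0 / (1.0 + math.exp(z))


def slicing_reward(latencies_ms, thresholds_ms, weights=None):
    """Weighted average of per-slice rewards (the weighting is an assumption)."""
    if weights is None:
        weights = [1.0] * len(latencies_ms)
    weighted = sum(
        w * sigmoid_latency_reward(lat, thr)
        for w, lat, thr in zip(weights, latencies_ms, thresholds_ms)
    )
    return weighted / sum(weights)
```

A smooth sigmoid of this kind penalizes latencies progressively as they approach the threshold instead of switching abruptly at it, which is one plausible reading of "risk-sensitive" here.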

📝 Abstract
Deep reinforcement learning (DRL)-based slicing policies have shown significant success in simulated environments but face challenges in physical systems such as open radio access networks (O-RANs) due to simulation-to-reality gaps. These policies often lack safety guarantees to ensure compliance with service level agreements (SLAs), such as the strict latency requirements of immersive applications. As a result, a deployed DRL slicing agent may make resource allocation (RA) decisions that degrade system performance, particularly in previously unseen scenarios. Real-world immersive applications require maintaining SLA constraints throughout deployment to prevent risky DRL exploration. In this paper, we propose SafeSlice to address both the cumulative (trajectory-wise) and instantaneous (state-wise) latency constraints of O-RAN slices. We incorporate the cumulative constraints by designing a sigmoid-based risk-sensitive reward function that reflects the slices' latency requirements. Moreover, we build a supervised learning cost model as part of a safety layer that projects the slicing agent's RA actions to the nearest safe actions, fulfilling instantaneous constraints. We conduct an exhaustive experiment that supports multiple services, including real virtual reality (VR) gaming traffic, to investigate the performance of SafeSlice under extreme and changing deployment conditions. SafeSlice achieves reductions of up to 83.23% in average cumulative latency, 93.24% in instantaneous latency violations, and 22.13% in resource consumption compared to the baselines. The results also indicate SafeSlice's robustness to changing the threshold configurations of latency constraints, a vital deployment scenario that will be realized by the O-RAN paradigm to empower mobile network operators (MNOs).
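
The projection step of the safety layer can be illustrated with a small sketch. It assumes the learned cost model is linearized around the agent's proposed action, in which case the nearest safe action under a single latency constraint has a closed form. The function name, the arguments, and the linearization itself are assumptions for illustration; the abstract does not specify how SafeSlice computes the projection.

```python
def project_to_safe_action(action, cost_gradient, predicted_cost, threshold):
    """Return the action closest (in Euclidean distance) to `action` that
    satisfies the linearized constraint
        predicted_cost + g . (a_safe - action) <= threshold,
    where `predicted_cost` is the (hypothetical) cost model's latency
    estimate at `action` and `cost_gradient` is g, its gradient w.r.t. the action."""
    violation = predicted_cost - threshold
    if violation <= 0.0:
        # The proposed action is already predicted to be safe.
        return list(action)
    grad_norm_sq = sum(g * g for g in cost_gradient)
    if grad_norm_sq == 0.0:
        # The model predicts the action has no effect on latency,
        # so no correction along the gradient is possible.
        return list(action)
    # Smallest step against the gradient that removes the violation.
    step = violation / grad_norm_sq
    return [a - step * g for a, g in zip(action, cost_gradient)]


# Usage: two resource shares, with latency predicted to fall as shares grow
# (negative gradient), so the correction increases the allocation.
proposed = [0.2, 0.3]
gradient = [-50.0, -10.0]  # illustrative values, ms per unit of resource share
safe = project_to_safe_action(proposed, gradient, predicted_cost=30.0, threshold=20.0)
```

This sketch ignores bounds on the action (for example, shares limited to a fixed resource budget); a practical safety layer would also need to keep the corrected action inside the valid allocation range.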
Problem

Research questions and friction points this paper is trying to address.

Ensuring SLA compliance in O-RAN slicing with safe DRL.
Addressing simulation-to-reality gaps in DRL-based slicing policies.
Reducing latency violations and resource consumption in O-RAN systems.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sigmoid-based risk-sensitive reward function
Supervised learning cost model safety layer
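Reductions of up to 83.23% in average cumulative latency, 93.24% in instantaneous latency violations, and 22.13% in resource consumption compared to the baselines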