🤖 AI Summary
This work addresses the challenge of meeting the 1 ms latency deadline of ultra-reliable low-latency communication (URLLC) in dense 6G deployments, where dynamic spectrum slicing must contend simultaneously with non-stationary channels, stringent quality-of-service (QoS) constraints, and data privacy requirements. To this end, the paper proposes SliceFed, a framework that combines federated learning, constrained multi-agent reinforcement learning, and Lagrangian duality, modeling spectrum slicing as a constrained Markov decision process (CMDP). SliceFed enables privacy-preserving collaborative policy learning through a proximal policy optimization (PPO)-based primal-dual algorithm combined with federated averaging. Experimental results show that SliceFed achieves nearly 100% compliance with the 1 ms URLLC latency target in dense multi-cell environments, outperforming heuristic and unconstrained baselines while remaining robust to variations in traffic load.
📝 Abstract
Dynamic spectrum slicing is a critical enabler for 6G Radio Access Networks (RANs), allowing the coexistence of heterogeneous services. However, optimizing resource allocation in dense, interference-limited deployments remains challenging due to non-stationary channel dynamics, strict Quality-of-Service (QoS) requirements, and the need for data privacy. In this paper, we propose SliceFed, a novel Federated Constrained Multi-Agent Deep Reinforcement Learning (F-MADRL) framework. SliceFed formulates the slicing problem as a Constrained Markov Decision Process (CMDP) where autonomous gNB agents maximize spectral efficiency while explicitly satisfying inter-cell interference budgets and hard ultra-reliable low-latency communication (URLLC) latency deadlines. We employ a Lagrangian primal-dual approach integrated with Proximal Policy Optimization (PPO) to enforce constraints, while Federated Averaging enables collaborative learning without exchanging raw local data. Extensive simulations in a dense multi-cell environment demonstrate that SliceFed converges to a stable, safety-aware policy. Unlike heuristic and unconstrained baselines, SliceFed achieves nearly 100% satisfaction of 1 ms URLLC latency deadlines and exhibits superior robustness to traffic load variations, verifying its potential for reliable and scalable 6G spectrum management.
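The abstract names two mechanisms without giving their details: a Lagrangian primal-dual update that enforces the constraints, and Federated Averaging that shares model parameters rather than raw data. The following is a minimal sketch of how these pieces are commonly written, not the paper's implementation. The function names, the scalar cost signals, the learning rate, and the sample-weighted averaging rule are all assumptions made for illustration; the PPO policy update itself is omitted.

```python
from typing import List, Sequence


def dual_ascent_step(lmbda: float, avg_cost: float, budget: float, lr: float) -> float:
    """Projected dual ascent on one Lagrange multiplier (hypothetical helper).

    The multiplier grows when the measured constraint cost exceeds its
    budget and shrinks otherwise; it is clipped so it never goes below 0.
    """
    return max(0.0, lmbda + lr * (avg_cost - budget))


def lagrangian_reward(reward: float, costs: Sequence[float], lambdas: Sequence[float]) -> float:
    """Penalized reward a policy-gradient learner could maximize.

    Assumption: each constraint (e.g. interference, latency violation)
    contributes a scalar cost weighted by its current multiplier.
    """
    return reward - sum(l * c for l, c in zip(lambdas, costs))


def fed_avg(local_weights: Sequence[Sequence[float]], num_samples: Sequence[int]) -> List[float]:
    """Sample-weighted average of per-agent parameter vectors.

    Only parameters are combined; the agents' raw local data never
    leaves the agent. Parameters are flat lists here for simplicity.
    """
    total = sum(num_samples)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, num_samples)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Hypothetical per-round latency-violation rates against a 1% budget.
    lam = 0.0
    for violation_rate in [0.10, 0.06, 0.02, 0.005]:
        lam = dual_ascent_step(lam, violation_rate, budget=0.01, lr=5.0)
        print(f"violation={violation_rate:.3f}  lambda={lam:.3f}")

    # Two hypothetical agents with different amounts of local experience.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this sketch the multiplier rises while the violation rate is above budget and starts to fall once the rate drops below it, which is the behavior that pushes a constrained policy toward feasibility. How often aggregation happens relative to local policy updates is a design choice the abstract does not specify.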