🤖 AI Summary
Dense Wi-Fi networks suffer from overlapping basic service set (OBSS) interference and dynamic traffic, both of which limit the performance of multi-access point coordination (MAPC) scheduling. This paper formulates cross-AP joint scheduling as a sequential decision-making problem and proposes a deep reinforcement learning framework based on proximal policy optimization (PPO). The agent is trained in an IEEE 802.11bn-compatible Gymnasium environment that exposes queue states, delay metrics, and channel conditions, and it schedules simultaneous transmissions through spatial reuse (SR) groups. Simulations across diverse traffic loads and patterns show that the approach reduces 99th-percentile delay by up to 30% compared with the best baseline and outperforms existing heuristic methods. The core contributions are: (i) a sequential decision model for MAPC scheduling, and (ii) a learned scheduler that targets network-wide worst-case latency.
📝 Abstract
Multi-access point coordination (MAPC) is a key feature of IEEE 802.11bn, with a potential impact on future Wi-Fi networks. MAPC enables joint scheduling decisions across multiple access points (APs) to improve throughput, latency, and reliability in dense Wi-Fi deployments. However, implementing efficient scheduling policies under diverse traffic and interference conditions in overlapping basic service sets (OBSSs) remains a complex task. This paper formulates MAPC scheduling as a sequential decision-making problem and proposes a deep reinforcement learning (DRL) mechanism that minimizes the network-wide worst-case delay in OBSS deployments. Specifically, we train a DRL agent using proximal policy optimization (PPO) within an 802.11bn-compatible Gymnasium environment. This environment provides observations of queue states, delay metrics, and channel conditions, enabling the agent to schedule multiple AP-station pairs to transmit simultaneously by leveraging spatial reuse (SR) groups. Simulations demonstrate that the proposed solution outperforms state-of-the-art heuristic strategies across a wide range of network loads and traffic patterns. The trained machine learning (ML) models consistently achieve lower 99th-percentile delays, showing up to a 30% improvement over the best baseline.
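To make the sequential decision framing concrete, the following is a minimal sketch of a slot-based scheduling environment, not the authors' simulator. It uses only the Python standard library and merely mirrors the `reset`/`step` return convention of Gymnasium. The topology, the SR group table, the Bernoulli arrival model, the reward (negative worst head-of-line delay), and all names (`ToyMapcEnv`, `greedy_hol_policy`, `percentile`, `run_episode`) are assumptions made for illustration. Channel conditions are omitted, and a simple greedy heuristic stands in for the trained PPO agent.

```python
import random

# Hypothetical toy topology: 2 APs with 2 stations each -> 4 AP-station links.
LINKS = [(0, 0), (0, 1), (1, 2), (1, 3)]  # (ap_id, station_id)
# Hypothetical SR groups: link indices assumed able to transmit in the same slot.
SR_GROUPS = [(0,), (1,), (2,), (3,), (0, 3), (1, 2)]


class ToyMapcEnv:
    """Slot-based toy scheduler environment (not an 802.11bn model)."""

    def __init__(self, arrival_prob=0.3, horizon=200, seed=0):
        self.arrival_prob = arrival_prob
        self.horizon = horizon
        self.rng = random.Random(seed)

    def _obs(self):
        # Observation: queue length and head-of-line (HoL) delay per link.
        qlen = [len(q) for q in self.queues]
        hol = [self.t - q[0] if q else 0 for q in self.queues]
        return qlen + hol

    def reset(self):
        self.t = 0
        self.queues = [[] for _ in LINKS]  # each queue stores arrival slots
        self.delays = []                   # delays of delivered packets
        return self._obs(), {}

    def step(self, action):
        # 1) Bernoulli packet arrivals on every link.
        for q in self.queues:
            if self.rng.random() < self.arrival_prob:
                q.append(self.t)
        # 2) The chosen SR group serves one packet per member link.
        for link in SR_GROUPS[action]:
            if self.queues[link]:
                self.delays.append(self.t - self.queues[link].pop(0) + 1)
        self.t += 1
        obs = self._obs()
        # 3) Reward penalizes the worst HoL delay across all links.
        reward = -float(max(obs[len(LINKS):]))
        truncated = self.t >= self.horizon
        return obs, reward, False, truncated, {}


def greedy_hol_policy(obs):
    """Pick the SR group with the largest summed HoL delay (heuristic stand-in)."""
    hol = obs[len(LINKS):]
    return max(range(len(SR_GROUPS)),
               key=lambda g: sum(hol[i] for i in SR_GROUPS[g]))


def percentile(values, p):
    """Nearest-rank percentile for integer p in [0, 100]; 0 if values is empty."""
    if not values:
        return 0
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[max(0, rank - 1)]


def run_episode(env, policy):
    """Roll out one episode and return the 99th-percentile packet delay."""
    obs, _ = env.reset()
    done = False
    while not done:
        obs, _, terminated, truncated, _ = env.step(policy(obs))
        done = terminated or truncated
    return percentile(env.delays, 99)


if __name__ == "__main__":
    print("p99 delay (slots):", run_episode(ToyMapcEnv(), greedy_hol_policy))
```

In this sketch the action is an index into a fixed SR group table, so the policy chooses which set of AP-station pairs transmits in each slot. A learned agent would replace `greedy_hol_policy` with a policy trained on the same observation and reward signals.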