Deep Reinforcement Learning-Based Scheduling for Wi-Fi Multi-Access Point Coordination

📅 2025-07-25

📈 Citations: 0

✨ Influential: 0

career value

228K/year

🤖 AI Summary

To address performance bottlenecks in multi-access-point (MAPC) coordinated scheduling under overlapping basic service set (OBSS) interference and dynamic traffic in dense Wi-Fi networks, this paper formulates cross-AP joint scheduling as a sequential decision-making problem—its first such formulation—and proposes a Proximal Policy Optimization (PPO)-based deep reinforcement learning framework. Implemented in the Gymnasium standard simulation environment, the framework integrates spatial reuse grouping and queue-state-aware mechanisms to derive adaptive, IEEE 802.11bn-compliant scheduling policies. Experiments across diverse traffic loads and patterns demonstrate that the approach reduces the 99th-percentile latency by 30% compared to the optimal baseline, significantly outperforming existing heuristic methods while simultaneously improving throughput and connection reliability. The core contributions are: (i) the first sequential decision model for MAPC scheduling, and (ii) an end-to-end learned scheduler achieving high robustness and ultra-low latency.

Technology Category

Application Category

📝 Abstract

Multi-access point coordination (MAPC) is a key feature of IEEE 802.11bn, with a potential impact on future Wi-Fi networks. MAPC enables joint scheduling decisions across multiple access points (APs) to improve throughput, latency, and reliability in dense Wi-Fi deployments. However, implementing efficient scheduling policies under diverse traffic and interference conditions in overlapping basic service sets (OBSSs) remains a complex task. This paper presents a method to minimize the network-wide worst-case latency by formulating MAPC scheduling as a sequential decision-making problem and proposing a deep reinforcement learning (DRL) mechanism to minimize worst-case delays in OBSS deployments. Specifically, we train a DRL agent using proximal policy optimization (PPO) within an 802.11bn-compatible Gymnasium environment. This environment provides observations of queue states, delay metrics, and channel conditions, enabling the agent to schedule multiple AP-station pairs to transmit simultaneously by leveraging spatial reuse (SR) groups. Simulations demonstrate that our proposed solution outperforms state-of-the-art heuristic strategies across a wide range of network loads and traffic patterns. The trained machine learning (ML) models consistently achieve lower 99th-percentile delays, showing up to a 30% improvement over the best baseline.

Problem

Research questions and friction points this paper is trying to address.

Minimizing worst-case latency in Wi-Fi MAPC scheduling

Optimizing AP coordination under diverse traffic conditions

Improving delay performance using DRL in OBSS deployments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep reinforcement learning for Wi-Fi scheduling

Proximal policy optimization in Gymnasium environment

Spatial reuse groups for multi-AP coordination

🔎 Similar Papers

Federated Deep Reinforcement Learning-Based Intelligent Channel Access in Dense Wi-Fi Deployments