🤖 AI Summary
To address low spatial reuse (SR) efficiency and poor fairness in multi-AP coexistence scenarios envisioned for Wi-Fi 8 and beyond, this paper proposes a distributed online learning framework based on multi-agent multi-armed bandits (MA-MAB). It is the first to jointly optimize packet detection (PD) threshold adaptation and transmit power control within an MA-MAB paradigm. A decentralized reward-sharing mechanism enables AI-native, coordinator-free dynamic SR optimization. Evaluated on the Komondor simulation platform, the proposed approach achieves a 15% average throughput gain, a 210% improvement in minimum network throughput, and a maximum access delay of ≤3 ms. These results demonstrate substantial gains in spectral efficiency and user fairness under dense multi-AP deployments.
📝 Abstract
Multi-Access Point Coordination (MAPC) and Artificial Intelligence and Machine Learning (AI/ML) are expected to be key features in future Wi-Fi, such as the forthcoming IEEE 802.11bn (Wi-Fi 8) and beyond. In this paper, we explore a coordinated solution based on online learning to drive the optimization of Spatial Reuse (SR), a method that allows multiple devices to transmit simultaneously by controlling interference through Packet Detect (PD) threshold adjustment and transmit power control. In particular, we focus on a Multi-Agent Multi-Armed Bandit (MA-MAB) setting, where multiple decision-making agents concurrently configure the SR parameters of coexisting networks by leveraging the MAPC framework, and study various algorithms and reward-sharing mechanisms. We evaluate different MA-MAB implementations using Komondor, a well-adopted Wi-Fi simulator, and demonstrate that AI-native SR enabled by coordinated MABs can improve network performance over current Wi-Fi operation: mean throughput increases by 15%, fairness improves as the minimum throughput across the network rises by 210%, and the maximum access delay stays below 3 ms.
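To make the MA-MAB setting concrete, here is a minimal sketch of a per-AP bandit agent choosing a joint (PD threshold, transmit power) arm, together with a fairness-oriented reward-sharing rule. Everything here is an illustrative assumption: the arm values, the ε-greedy policy, and the min-throughput shared reward are placeholders, not the paper's actual algorithms or configuration.

```python
import random

# Illustrative arm space: (PD threshold in dBm, transmit power in dBm).
# These values are placeholders, not the paper's configuration.
PD_THRESHOLDS = [-82, -72, -62]
TX_POWERS = [5, 10, 15, 20]
ARMS = [(pd, p) for pd in PD_THRESHOLDS for p in TX_POWERS]

class EpsilonGreedyAgent:
    """One bandit agent per AP, acting over the joint (PD, power) arm space."""

    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * len(ARMS)       # pulls per arm
        self.values = [0.0] * len(ARMS)     # running mean reward per arm

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best arm so far.
        if random.random() < self.epsilon:
            return random.randrange(len(ARMS))
        return max(range(len(ARMS)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental mean update of the chosen arm's estimated reward.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def shared_reward(throughputs):
    """Toy reward-sharing rule: every agent receives the minimum AP
    throughput, which steers the system toward fairness. This is an
    assumed mechanism for illustration, not the paper's exact one."""
    return min(throughputs)
```

In a simulation loop, each AP agent would call `select_arm()`, the chosen SR configurations would be applied, the resulting per-AP throughputs measured, and `shared_reward(...)` fed back to every agent via `update(...)`; the decentralized exchange of these rewards is what the MAPC framework would carry in practice.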