Learning-Based Channel Access in Wi-Fi: A Multi-Armed Bandit Approach

📅 2025-11-13
🤖 AI Summary
IEEE 802.11's static channel access mechanism is inefficient in dynamic wireless environments, leading to low spectral utilization and increased collisions. To address this, the paper proposes a decentralized reinforcement learning framework based on contextual multi-armed bandits (contextual MAB). The framework jointly optimizes primary channel selection, channel bandwidth configuration, and contention window size via factorized action-space modeling and a cooperative multi-agent architecture, enabling implicit distributed coordination. Its key innovation lies in integrating context-aware decision-making into the Wi-Fi MAC layer and rigorously characterizing its critical impact on convergence speed and coexistence performance. Experiments demonstrate that the proposed method improves spectral efficiency by 37% and accelerates convergence by 2.1× over conventional approaches. Context-aware modeling significantly outperforms non-contextual baselines. While the multi-agent design ensures fair resource sharing, it requires explicit mitigation of local greedy behavior to avoid interference with heterogeneous networks.

📝 Abstract
Due to its static protocol design, IEEE 802.11 (aka Wi-Fi) channel access lacks adaptability to address dynamic network conditions, resulting in inefficient spectrum utilization, unnecessary contention, and packet collisions. This paper investigates reinforcement learning (RL) solutions to optimize Wi-Fi's medium access control (MAC). In particular, a multi-armed bandit (MAB) framework is proposed for dynamic channel access (including both the primary channel and channel width) and contention window (CW) adjustment. In this setting, we study relevant learning design principles such as adopting joint or factorial action spaces (handled by a single agent (SA) and multiple agents (MA), respectively) and the importance of incorporating contextual information. Our simulation results show that cooperative MA architectures converge faster than their SA counterparts, as agents operate over smaller action spaces. Another key insight is that contextual MAB algorithms consistently outperform non-contextual ones, highlighting the value of leveraging side information in action selection. Moreover, in multi-player settings, results demonstrate that decentralized learners can achieve implicit coordination, although their greediness may degrade coexisting networks' performance and induce policy-chasing dynamics. Overall, these findings demonstrate that (contextual) MAB-based learning offers a practical and adaptive alternative to static IEEE 802.11 protocols, enabling more efficient and intelligent spectrum utilization.
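To make the joint versus factorial action-space distinction concrete, here is a minimal Python sketch. It is not the paper's implementation: the channel numbers, widths, CW values, the epsilon-greedy rule, and `toy_reward` are all illustrative assumptions, since the abstract does not specify them. It only shows why three cooperating agents face far fewer arms (4 + 3 + 4) than one agent over the joint space (4 × 3 × 4), with every agent updated from the same shared reward.

```python
import itertools
import random

# Hypothetical action sets, chosen only for illustration (not from the paper).
PRIMARY_CHANNELS = [36, 40, 44, 48]
WIDTHS_MHZ = [20, 40, 80]
CW_VALUES = [15, 31, 63, 127]


class EpsilonGreedyAgent:
    """Plain (non-contextual) epsilon-greedy bandit over a list of arms."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = [0] * len(self.arms)
        self.values = [0.0] * len(self.arms)  # running mean reward per arm

    def select(self):
        # Explore with probability epsilon, otherwise pick the best-known arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        return self.values.index(max(self.values))

    def update(self, arm_index, reward):
        # Incremental sample-mean update.
        self.counts[arm_index] += 1
        n = self.counts[arm_index]
        self.values[arm_index] += (reward - self.values[arm_index]) / n


def joint_action_space():
    """Single-agent (SA) view: one arm per (primary, width, CW) tuple."""
    return list(itertools.product(PRIMARY_CHANNELS, WIDTHS_MHZ, CW_VALUES))


def make_factorized_agents(seed=0):
    """Multi-agent (MA) view: one small bandit per decision dimension."""
    rng = random.Random(seed)
    return {
        "primary": EpsilonGreedyAgent(PRIMARY_CHANNELS, rng=rng),
        "width": EpsilonGreedyAgent(WIDTHS_MHZ, rng=rng),
        "cw": EpsilonGreedyAgent(CW_VALUES, rng=rng),
    }


def toy_reward(primary, width, cw):
    """Made-up reward standing in for a measured metric such as throughput."""
    score = 0.4 if primary == 44 else 0.1
    score += 0.4 if width == 40 else 0.1
    score += 0.2 if cw == 31 else 0.05
    return score


def run_factorized(steps=2000, seed=0):
    """Cooperative loop: every agent is updated with the same shared reward."""
    agents = make_factorized_agents(seed)
    for _ in range(steps):
        picks = {name: agent.select() for name, agent in agents.items()}
        config = {name: agents[name].arms[i] for name, i in picks.items()}
        reward = toy_reward(config["primary"], config["width"], config["cw"])
        for name, agent in agents.items():
            agent.update(picks[name], reward)
    # Report each agent's current best-estimated arm.
    return {n: a.arms[a.values.index(max(a.values))] for n, a in agents.items()}
```

Calling `len(joint_action_space())` gives the SA arm count, and `run_factorized()` returns the configuration each MA agent currently prefers. In a real deployment the reward would come from observed network performance rather than a fixed function.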
Problem

Research questions and friction points this paper is trying to address.

Optimizing Wi-Fi channel access using reinforcement learning
Addressing static protocol limitations with multi-armed bandit framework
Improving spectrum utilization through contextual learning algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-armed bandit framework for Wi-Fi channel access
Contextual reinforcement learning for dynamic spectrum utilization
Cooperative multi-agent architecture for faster convergence
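
As a rough illustration of what contextual learning adds, the Python sketch below keeps a separate value table per context, so the preferred contention window can differ between conditions. Everything here is assumed for illustration: the two load contexts, the CW values, the epsilon-greedy rule, and the deterministic `toy_reward` are hypothetical and are not the paper's algorithm or context definition.

```python
import random
from collections import defaultdict

CW_VALUES = [15, 31, 63, 127]         # hypothetical contention window options
CONTEXTS = ["low_load", "high_load"]  # hypothetical discretized side information


class ContextualEpsilonGreedy:
    """Tabular contextual bandit: one running-mean table per observed context."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(lambda: [0] * len(self.arms))
        self.values = defaultdict(lambda: [0.0] * len(self.arms))

    def select(self, context):
        # Explore with probability epsilon, else exploit within this context.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.arms))
        values = self.values[context]
        return values.index(max(values))

    def update(self, context, arm_index, reward):
        # Incremental sample-mean update for the (context, arm) pair.
        self.counts[context][arm_index] += 1
        n = self.counts[context][arm_index]
        old = self.values[context][arm_index]
        self.values[context][arm_index] = old + (reward - old) / n

    def greedy_arm(self, context):
        values = self.values[context]
        return self.arms[values.index(max(values))]


def toy_reward(context, cw):
    """Made-up reward: a small CW pays off under low load, a larger one under high load."""
    best = 15 if context == "low_load" else 63
    return 1.0 if cw == best else 0.2


def train(steps=2000, seed=0):
    agent = ContextualEpsilonGreedy(CW_VALUES, seed=seed)
    env_rng = random.Random(seed + 1)  # separate stream for the toy environment
    for _ in range(steps):
        context = env_rng.choice(CONTEXTS)
        arm = agent.select(context)
        agent.update(context, arm, toy_reward(context, agent.arms[arm]))
    return agent
```

After `agent = train()`, `agent.greedy_arm("low_load")` and `agent.greedy_arm("high_load")` can differ, which a non-contextual bandit with a single value table cannot express.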