Learning-Based Channel Access in Wi-Fi: A Multi-Armed Bandit Approach

📅 2025-11-13

📈 Citations: 0

✨ Influential: 0

career value

245K/year

🤖 AI Summary

To address the inefficiency of IEEE 802.11’s static channel access mechanism in dynamic wireless environments—leading to low spectral utilization and increased collisions—this paper proposes a decentralized reinforcement learning framework based on contextual multi-armed bandits (Contextual MAB). The framework jointly optimizes primary channel selection, channel bandwidth configuration, and contention window size via factorized action-space modeling and a multi-agent cooperative architecture, enabling implicit distributed coordination. Its key innovation lies in integrating context-aware decision-making into the Wi-Fi MAC layer and rigorously characterizing its critical impact on convergence speed and coexistence performance. Experiments demonstrate that the proposed method improves spectral efficiency by 37% and accelerates convergence by 2.1× over conventional approaches; context-aware modeling significantly outperforms non-contextual baselines; and while the multi-agent design ensures fair resource sharing, it necessitates explicit mitigation of local greedy behavior to avoid interference with heterogeneous networks.

Technology Category

Application Category

📝 Abstract

Due to its static protocol design, IEEE 802.11 (aka Wi-Fi) channel access lacks adaptability to address dynamic network conditions, resulting in inefficient spectrum utilization, unnecessary contention, and packet collisions. This paper investigates reinforcement learning (RL) solutions to optimize Wi-Fi's medium access control (MAC). In particular, a multi-armed bandit (MAB) framework is proposed for dynamic channel access (including both the primary channel and channel width) and contention window (CW) adjustment. In this setting, we study relevant learning design principles such as adopting joint or factorial action spaces (handled by a single agent (SA) and multiple agents (MA), respectively) and the importance of incorporating contextual information. Our simulation results show that cooperative MA architectures converge faster than their SA counterparts, as agents operate over smaller action spaces. Another key insight is that contextual MAB algorithms consistently outperform non-contextual ones, highlighting the value of leveraging side information in action selection. Moreover, in multi-player settings, results demonstrate that decentralized learners can achieve implicit coordination, although their greediness may degrade coexisting networks'performance and induce policy-chasing dynamics. Overall, these findings demonstrate that (contextual) MAB-based learning offers a practical and adaptive alternative to static IEEE 802.11 protocols, enabling more efficient and intelligent spectrum utilization.

Problem

Research questions and friction points this paper is trying to address.

Optimizing Wi-Fi channel access using reinforcement learning

Addressing static protocol limitations with multi-armed bandit framework

Improving spectrum utilization through contextual learning algorithms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-armed bandit framework for Wi-Fi channel access

Contextual reinforcement learning for dynamic spectrum utilization

Cooperative multi-agent architecture for faster convergence

🔎 Similar Papers

Federated Deep Reinforcement Learning-Based Intelligent Channel Access in Dense Wi-Fi Deployments