Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning

📅 2025-03-31
🤖 AI Summary
In decentralized wireless networks, multiple source-destination pairs face a fundamental trade-off between maximizing throughput and ensuring fairness when autonomously learning spectrum access policies over a limited set of orthogonal frequency bands. To address this, the authors propose FSRL (Fair Share RL), a coordination-free, fully decentralized reinforcement learning framework for spectrum access. FSRL combines three key ingredients: (i) state augmentation with a semi-adaptive time reference, (ii) an architecture leveraging risk control and time difference likelihood, and (iii) a fairness-driven reward structure grounded in Jain's fairness index. Agents learn from the feedback of their own transmissions (success or collision) alone, with no knowledge of the network size or of other sources' strategies. Evaluated across more than 50 heterogeneous network settings, FSRL is up to 89.0% fairer than a common baseline RL algorithm, particularly in stringent single-band, multi-user settings, and 48.1% fairer on average, while also improving individual throughput.

📝 Abstract
We consider a decentralized wireless network with several source-destination pairs sharing a limited number of orthogonal frequency bands. Sources learn to adapt their transmissions (specifically, their band selection strategy) over time, in a decentralized manner, without sharing information with each other. Sources can only observe the outcome of their own transmissions (i.e., success or collision), having no prior knowledge of the network size or of the transmission strategy of other sources. The goal of each source is to maximize its own throughput while striving for network-wide fairness. We propose a novel fully decentralized Reinforcement Learning (RL)-based solution that achieves fairness without coordination. The proposed Fair Share RL (FSRL) solution combines: (i) state augmentation with a semi-adaptive time reference; (ii) an architecture that leverages risk control and time difference likelihood; and (iii) a fairness-driven reward structure. We evaluate FSRL in more than 50 network settings with different numbers of agents, different amounts of available spectrum, in the presence of jammers, and in an ad-hoc setting. Simulation results suggest that, when we compare FSRL with a common baseline RL algorithm from the literature, FSRL can be up to 89.0% fairer (as measured by Jain's fairness index) in stringent settings with several sources and a single frequency band, and 48.1% fairer on average.
Problem

Research questions and friction points this paper is trying to address.

Decentralized spectrum access without coordination among agents
Maximize throughput while ensuring network-wide fairness
Adaptive band selection with limited observation and no prior knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fully decentralized multi-agent reinforcement learning
State augmentation with semi-adaptive time reference
Fairness-driven reward structure with risk control
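The fairness-driven reward structure is grounded in Jain's fairness index, which for per-source throughputs x_1, …, x_n is (Σ x_i)² / (n · Σ x_i²) and ranges from 1/n (one source monopolizes the spectrum) to 1 (perfectly equal shares). A minimal sketch in Python; the function name is an assumption for illustration, not the authors' code:

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 for perfectly equal throughputs and 1/n when a
    single source captures all the throughput. Returns 0.0 for the
    degenerate all-zero case.
    """
    n = len(throughputs)
    total = sum(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    return (total * total) / (n * sum_sq) if sum_sq > 0 else 0.0
```

For example, four sources with equal throughput give an index of 1.0, while one source monopolizing a single band among four gives 0.25, which is the kind of outcome the fairness-driven reward is designed to penalize.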
Yubo Zhang
Department of Electrical and Computer Engineering, Northwestern University, USA
Pedro Botelho
Department of Electrical and Computer Engineering, Northwestern University, USA; Center of Excellence in Artificial Intelligence (CEIA), Brazil
Trevor Gordon
Department of Electrical Engineering, Columbia University, USA
Gil Zussman
Columbia University
Wireless Networks, Mobile Networks, Computer Networks, Network Resilience
Igor Kadota
Assistant Professor at Northwestern University
Wireless Networks, Computer Networks, Network Optimization, Testbeds