Consensus-based Decentralized Multi-agent Reinforcement Learning for Random Access Network Optimization

📅 2025-08-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address frequent packet collisions and channel access unfairness in random-access networks, this paper proposes a fully decentralized multi-agent reinforcement learning (MARL)-based MAC protocol. Methodologically, it abandons centralized training and instead introduces a distributed Actor-Critic architecture grounded in local reward exchange and consensus-based policy updates; agents share only scalar rewards—not states or actions—thereby substantially reducing communication overhead. A consensus mechanism ensures policy convergence without reliance on a central coordinator. The key contribution is the first integration of a lightweight reward consensus mechanism into decentralized MARL-based MAC design, enabling efficient cooperative optimization under partial observability. Experiments demonstrate that the proposed protocol reduces collision rate by 23.6% and improves fairness—measured by Jain’s index—by 31.4% over state-of-the-art baselines.

Technology Category

Application Category

📝 Abstract
With wireless devices increasingly forming a unified smart network for seamless, user-friendly operations, random access (RA) medium access control (MAC) design is considered a key solution for handling unpredictable data traffic from multiple terminals. However, it remains challenging to design an effective RA-based MAC protocol to minimize collisions and ensure transmission fairness across the devices. While existing multi-agent reinforcement learning (MARL) approaches with centralized training and decentralized execution (CTDE) have been proposed to optimize RA performance, their reliance on centralized training and the significant overhead required for information collection can make real-world applications unrealistic. In this work, we adopt a fully decentralized MARL architecture, where policy learning does not rely on centralized tasks but leverages consensus-based information exchanges across devices. We design our MARL algorithm over an actor-critic (AC) network and propose exchanging only local rewards to minimize communication overhead. Furthermore, we provide a theoretical proof of global convergence for our approach. Numerical experiments show that our proposed MARL algorithm can significantly improve RA network performance compared to other baselines.
Problem

Research questions and friction points this paper is trying to address.

Design decentralized MAC protocol to minimize collisions and ensure fairness
Overcome centralized training limitations in multi-agent reinforcement learning
Reduce communication overhead with local reward exchanges
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fully decentralized MARL architecture
Consensus-based local reward exchanges
Actor-critic network with global convergence
🔎 Similar Papers
No similar papers found.