Joint Power Allocation and Phase Shift Design for Stacked Intelligent Metasurfaces-aided Cell-Free Massive MIMO Systems with MARL

📅 2025-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the power consumption bottleneck in cell-free massive MIMO systems, this paper integrates stacked intelligent metasurfaces (SIMs) into the architecture and jointly optimizes access point power allocation and SIM phase shifts to maximize the sum spectral efficiency. It proposes NVR-MAPPO, a multi-agent reinforcement learning algorithm that combines a noisy value method with recurrent neural network-based policies to encourage diverse exploration and improve convergence robustness under a centralized training and decentralized execution framework. Compared with baseline methods, the proposed approach achieves higher sum spectral efficiency across diverse user distributions and channel conditions, while also improving energy efficiency and showing strong generalization and robustness. The work points toward co-optimizing reconfigurable electromagnetic surfaces and wireless resource allocation.

📝 Abstract
Cell-free (CF) massive multiple-input multiple-output (mMIMO) systems offer high spectral efficiency (SE) through multiple distributed access points (APs). However, the large number of antennas increases power consumption. We propose incorporating stacked intelligent metasurfaces (SIM) into CF mMIMO systems as a cost-effective, energy-efficient solution. This paper focuses on optimizing the joint power allocation of APs and the phase shift of SIMs to maximize the sum SE. To address this complex problem, we introduce a fully distributed multi-agent reinforcement learning (MARL) algorithm. Our novel algorithm, the noisy value method with a recurrent policy in multi-agent policy optimization (NVR-MAPPO), enhances performance by encouraging diverse exploration under centralized training and decentralized execution. Simulations demonstrate that NVR-MAPPO significantly improves sum SE and robustness across various scenarios.
Problem

Research questions and friction points this paper is trying to address.

Optimize power allocation in CF mMIMO
Design phase shifts for SIM integration
Maximize sum spectral efficiency with a MARL algorithm
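
As a rough illustration of the objective named above, the sketch below computes a sum SE as the sum of log2(1 + SINR) over users. It assumes a generic toy interference model, not the paper's system model: the function name, the `powers` vector, the `gains` matrix of effective channel gains (which in the paper's setting would depend on the SIM phase shifts), and `noise_power` are all hypothetical.

```python
import math


def sum_spectral_efficiency(powers, gains, noise_power):
    """Sum SE in bit/s/Hz for a toy K-user interference model (an assumption).

    powers[j]   : transmit power allotted to user j's stream (hypothetical)
    gains[k][j] : effective channel gain of stream j as seen by user k
    noise_power : receiver noise power, same units as powers * gains
    """
    total = 0.0
    for k, row in enumerate(gains):
        # Desired signal power at user k.
        signal = powers[k] * row[k]
        # Power of all other streams received at user k.
        interference = sum(
            powers[j] * row[j] for j in range(len(row)) if j != k
        )
        sinr = signal / (interference + noise_power)
        total += math.log2(1.0 + sinr)
    return total
```

In this toy model, a power allocation and phase shift design would be scored by passing the resulting `powers` and `gains` to the function; a learning agent would use such a value as its reward signal.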
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stacked intelligent metasurfaces integration
Multi-agent reinforcement learning algorithm
Centralized training with decentralized execution
Yiyang Zhu
Nanyang Technological University
Wireless Communication, Multi-Agent Systems, Large Wireless Model, RIS, SIM
Jiayi Zhang
School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
Enyu Shi
School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
Ziheng Liu
Beijing Jiaotong University
Cell-Free massive MIMO, Reinforcement learning, Signal Processing
Chau Yuen
IEEE Fellow, Highly Cited Researcher, Nanyang Technological University
Wireless, Smart Grid, Localization, IoT, Big Data
Bo Ai
School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China