Joint Power Allocation and Phase Shift Design for Stacked Intelligent Metasurfaces-aided Cell-Free Massive MIMO Systems with MARL

📅 2025-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To reduce the power cost of the large antenna arrays in cell-free massive MIMO systems, this paper integrates stacked intelligent metasurfaces (SIMs) into the architecture and jointly optimizes access point power allocation and SIM phase shifts to maximize sum spectral efficiency. It proposes NVR-MAPPO, a multi-agent reinforcement learning algorithm that combines a noisy value method with a recurrent policy to encourage diverse exploration under a centralized training and decentralized execution framework. In simulations, NVR-MAPPO improves sum spectral efficiency and robustness across various scenarios.

📝 Abstract

Cell-free (CF) massive multiple-input multiple-output (mMIMO) systems offer high spectral efficiency (SE) through multiple distributed access points (APs). However, the large number of antennas increases power consumption. We propose incorporating stacked intelligent metasurfaces (SIM) into CF mMIMO systems as a cost-effective, energy-efficient solution. This paper focuses on optimizing the joint power allocation of APs and the phase shift of SIMs to maximize the sum SE. To address this complex problem, we introduce a fully distributed multi-agent reinforcement learning (MARL) algorithm. Our novel algorithm, the noisy value method with a recurrent policy in multi-agent policy optimization (NVR-MAPPO), enhances performance by encouraging diverse exploration under centralized training and decentralized execution. Simulations demonstrate that NVR-MAPPO significantly improves sum SE and robustness across various scenarios.

Problem

Research questions and friction points this paper is trying to address.

Optimize AP power allocation in CF mMIMO systems

Design the phase shifts of the integrated SIMs

Maximize the sum spectral efficiency with a MARL algorithm
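
The objective named above can be illustrated with a minimal sketch. It assumes the standard Shannon-rate form of sum SE, i.e. the sum over users of log2(1 + SINR), and a simple total power budget per AP. The function names, the budget rule, and the example numbers are hypothetical and are not taken from the paper, which may use a different system model.

```python
import math


def sum_spectral_efficiency(sinrs):
    """Sum SE in bit/s/Hz from per-user SINRs given in linear scale."""
    return sum(math.log2(1.0 + s) for s in sinrs)


def scale_to_budget(powers, p_max):
    """Scale non-negative powers so that their total stays within p_max.

    Assumption: a single total-power budget per AP (hypothetical rule).
    """
    total = sum(powers)
    if total <= p_max:
        return list(powers)
    return [p * p_max / total for p in powers]


# Example: two users with linear SINRs of 1 and 3.
se = sum_spectral_efficiency([1.0, 3.0])

# Example: an action that exceeds a budget of 2 is scaled back onto it.
allocated = scale_to_budget([2.0, 2.0], 2.0)
```

In a learning setup, a projection such as `scale_to_budget` is one common way to keep an agent's raw action feasible before the reward (here, the sum SE) is evaluated.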

Innovation

Methods, ideas, or system contributions that make the work stand out.

Stacked intelligent metasurfaces integration

Multi-agent reinforcement learning algorithm

Centralized training with decentralized execution
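
To make the phase-shift design variable concrete, here is a minimal sketch of how a signal could pass through a stack of metasurface layers. It assumes the model commonly used in the SIM literature, where each layer applies a unit-modulus phase shift per element and a fixed propagation matrix links consecutive layers. The names `propagation` and `layer_phases` and the identity matrix in the example are illustrative assumptions, not the paper's channel model.

```python
import cmath
import math


def apply_phase_layer(signal, phases):
    """Multiply each element by exp(j * theta); magnitudes are unchanged."""
    return [cmath.exp(1j * theta) * x for x, theta in zip(signal, phases)]


def mat_vec(matrix, vector):
    """Plain matrix-vector product on nested lists of complex numbers."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def sim_response(signal, propagation, layer_phases):
    """Pass a signal through the stack: propagate, then phase-shift, per layer.

    propagation[l] is the (assumed fixed) matrix feeding layer l, and
    layer_phases[l] holds the tunable phases of that layer in radians.
    """
    out = list(signal)
    for w, phases in zip(propagation, layer_phases):
        out = apply_phase_layer(mat_vec(w, out), phases)
    return out


# Example: one layer, identity propagation, phases of 0 and pi.
identity = [[1.0, 0.0], [0.0, 1.0]]
out = sim_response([1.0, 1.0], [identity], [[0.0, math.pi]])
```

Under these assumptions the phases in `layer_phases` are the only tunable quantities, which is why the phase-shift design can be posed as a continuous action for a learning agent.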

🔎 Similar Papers

Multi-User MISO with Stacked Intelligent Metasurfaces: A DRL-Based Sum-Rate Optimization Approach

2024-08-09, arXiv.org, Citations: 6
