MAGRPO: Accelerated MARL Training for Fluid Antenna-Assisted Wireless Network Optimization

📅 2026-04-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the joint non-convex optimization of antenna positions, beamforming, and power allocation in fluid-antenna-assisted wireless networks that lack inter-base-station coordination. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and a multi-agent group relative policy optimization (MAGRPO) algorithm is proposed under the centralized training with decentralized execution (CTDE) paradigm. MAGRPO replaces the conventional critic network with a group-relative advantage estimator, which cuts computational complexity by nearly half under parameter sharing. The authors also derive an upper bound on the variance of the cumulative reward that scales with the number of base stations, users, and fluid antennas. In simulations, fluid-antenna networks achieve a multiple-fold sum-rate gain over fixed-antenna systems, and MAGRPO attains test sum rates comparable to those of MAPPO while reducing training time by 30%–40%.

📝 Abstract
The fluid antenna system (FAS) has emerged as a promising paradigm for next-generation wireless networks, enabling position-flexible antenna elements that can dynamically move to more favorable channel conditions. However, jointly optimizing fluid antenna (FA) positions, beamforming, and power allocation in FA-assisted wireless networks is challenging because the problem is non-convex and the base stations (BSs) do not coordinate. In this paper, we first formulate this optimization problem as a decentralized partially observable Markov decision process, and then propose a multi-agent group relative policy optimization (MAGRPO) algorithm under the centralized training with decentralized execution (CTDE) paradigm. Compared with multi-agent proximal policy optimization (MAPPO), MAGRPO replaces the critic network with group relative advantage estimation. This design reduces computational complexity by nearly half under parameter sharing. Furthermore, we derive an upper bound on the variance of the cumulative reward, which scales with network parameters such as the number of BSs, users, and FAs. Simulation results show that, compared with wireless networks with fixed antenna positions, FA-assisted wireless networks achieve a multiple-fold sum-rate enhancement. Moreover, the proposed MAGRPO attains sum rates comparable to those of MAPPO in testing, while reducing training time by $30\% \sim 40\%$.
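To illustrate what a critic-free, group relative advantage estimate can look like, here is a minimal sketch in the style of standard GRPO: each sampled rollout's reward is normalized against the mean and standard deviation of its own group, so no value network is needed. This is an assumption about the general technique, not the paper's exact estimator. The function name `group_relative_advantages`, the `eps` stabilizer, and the sample rewards are hypothetical.

```python
from statistics import fmean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and standard deviation.

    Assumption: follows the standard GRPO normalization; the paper's
    multi-agent variant may differ in how groups are formed.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mean = fmean(rewards)
    std = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every rollout earns the same reward
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical sum-rate rewards from four rollouts sampled under the same policy.
group_rewards = [4.0, 6.0, 5.0, 9.0]
advantages = group_relative_advantages(group_rewards)
```

Rollouts that beat the group average receive a positive advantage and the rest a negative one. In a PPO-style update these values would stand in for the critic-based advantage, which is where the savings over MAPPO would come from.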
Problem

Research questions and friction points this paper is trying to address.

fluid antenna system
wireless network optimization
multi-agent reinforcement learning
non-convex optimization
base station coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fluid Antenna System
Multi-Agent Reinforcement Learning
Group Relative Policy Optimization
Decentralized Optimization
Centralized Training Decentralized Execution
Authors
Wanzhe Wang
Harbin Institute of Technology, Shenzhen, China
Tong Zhang
Guangdong Provincial Key Laboratory of Aerospace Communication and Networking Technology, Harbin Institute of Technology, Shenzhen, 518055, China
Hao Xu
Southeast University
Wireless communication, mathematical optimization, information theory, MIMO systems
Shuai Wang
Shenzhen Institutes of Advanced Technology
autonomous systems, wireless communications
Rui Wang
Southern University of Science and Technology
Kai-Kit Wong
Department of Electronic and Electrical Engineering, University College London, Torrington Place, WC1E 7JE, United Kingdom; and Department of Electronic Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, Korea