MAGRPO: Accelerated MARL Training for Fluid Antenna-Assisted Wireless Network Optimization

📅 2026-04-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the joint non-convex optimization of antenna positions, beamforming, and power allocation in fluid-antenna-assisted wireless networks without inter-base-station coordination. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP), and a multi-agent group relative policy optimization (MAGRPO) algorithm is proposed under the centralized training with decentralized execution (CTDE) paradigm. MAGRPO replaces the conventional critic network with a group relative advantage estimator, which reduces computational complexity by nearly half under parameter sharing. The authors also derive an upper bound on the variance of the cumulative reward that scales with network parameters such as the number of base stations, users, and fluid antennas. Simulation results show that fluid-antenna networks achieve multiple-fold sum-rate gains over fixed-antenna networks, and that MAGRPO attains sum rates comparable to those of MAPPO in testing while reducing training time by 30%–40%.

📝 Abstract

The fluid antenna system (FAS) has emerged as a promising paradigm for next-generation wireless networks, enabling position-flexible antenna elements that can dynamically move to more favorable channel conditions. However, jointly optimizing fluid antenna (FA) positions, beamforming, and power allocation in FA-assisted wireless networks is challenging because the problem is non-convex and the base stations (BSs) do not coordinate. In this paper, we first formulate this optimization problem as a decentralized partially observable Markov decision process, and then propose a multi-agent group relative policy optimization (MAGRPO) algorithm under the centralized training with decentralized execution (CTDE) paradigm. Compared with multi-agent proximal policy optimization (MAPPO), MAGRPO replaces the critic network with group relative advantage estimation. This design reduces computational complexity by nearly half under parameter sharing. Furthermore, we derive an upper bound on the variance of the cumulative reward, which scales with network parameters, e.g., the number of BSs, users, and FAs. Simulation results show that, compared with wireless networks with fixed antenna positions, FA-assisted wireless networks achieve multiple-fold sum-rate enhancement. Moreover, the proposed MAGRPO attains sum rates comparable to those of MAPPO in testing, while reducing training time by $30\% \sim 40\%$.
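To make the idea of group relative advantage estimation concrete, here is a minimal sketch of how an advantage can be computed without a critic. It assumes the normalization commonly used in GRPO-style methods: each rollout's return is compared with the mean and standard deviation of its own group. The abstract does not spell out MAGRPO's exact estimator, so the function name `group_relative_advantages`, the `eps` parameter, and the sample returns are all hypothetical.

```python
from statistics import fmean, pstdev


def group_relative_advantages(returns, eps=1e-8):
    """Score each rollout against its own group instead of a learned critic.

    Assumption: GRPO-style normalization, A_i = (R_i - mean(R)) / (std(R) + eps).
    This may differ in detail from the estimator used in the paper.
    """
    if len(returns) < 2:
        raise ValueError("need at least two rollouts per group")
    mean = fmean(returns)
    std = pstdev(returns)  # population standard deviation over the group
    return [(r - mean) / (std + eps) for r in returns]


# Hypothetical cumulative rewards (e.g. sum rates) of four rollouts in one group.
# These numbers are illustrative only and do not come from the paper.
group_returns = [12.0, 15.0, 9.0, 14.0]
advantages = group_relative_advantages(group_returns)

# Rollouts above the group mean get a positive advantage, those below a negative one.
for ret, adv in zip(group_returns, advantages):
    print(f"return={ret:5.1f}  advantage={adv:+.3f}")
```

Because the baseline is the group mean rather than a value network, no critic has to be trained or evaluated, which is consistent with the complexity reduction the abstract attributes to removing the critic.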

Problem

Research questions and friction points this paper is trying to address.

fluid antenna system

wireless network optimization

multi-agent reinforcement learning

non-convex optimization

base station coordination

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fluid Antenna System

Multi-Agent Reinforcement Learning

Group Relative Policy Optimization