Group Relative Policy Optimization for Robust Blind Interference Alignment with Fluid Antennas

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the sum-rate optimization problem for K-user MISO downlink systems under imperfect channel state information by introducing, for the first time, a robust transmission framework that integrates fluid antennas with blind interference alignment. The authors propose a critic-free Group Relative Policy Optimization (GRPO) algorithm, which leverages group-based relative policy exploration to escape poor local optima while jointly optimizing antenna positions and beamforming vectors. Experimental results demonstrate that GRPO achieves a 4.17% sum-rate improvement over standard PPO and a 30.29% gain over pre-trained PPO. Moreover, it outperforms the MaximumGain and RandomGain baselines by 200.78% and 465.38%, respectively, while reducing both model size and computational overhead by nearly 50%.

📝 Abstract
Fluid antenna system (FAS) leverages dynamic reconfigurability to unlock spatial degrees of freedom and reshape wireless channels. Blind interference alignment (BIA) aligns interference through antenna switching. This paper proposes, for the first time, a robust fluid antenna-driven BIA framework for a K-user MISO downlink under imperfect channel state information (CSI). We formulate a robust sum-rate maximization problem through optimizing fluid antenna positions (switching positions). To solve this challenging non-convex problem, we employ group relative policy optimization (GRPO), a novel deep reinforcement learning algorithm that eliminates the critic network. This robust design reduces model size and floating point operations (FLOPs) by nearly half compared to proximal policy optimization (PPO) while significantly enhancing performance through group-based exploration that escapes bad local optima. Simulation results demonstrate that GRPO outperforms PPO by 4.17%, and a 100K-step pre-trained PPO by 30.29%. Due to error distribution learning, GRPO exceeds heuristic MaximumGain and RandomGain by 200.78% and 465.38%, respectively.
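The abstract's key algorithmic idea is that GRPO replaces PPO's learned critic with a group-based baseline: for each state, a group of candidate actions is sampled, and each action's advantage is its reward measured relative to the group's statistics. A minimal sketch of that critic-free advantage step is below; the function name, the group size, and the sum-rate reward values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Critic-free advantage estimate in the GRPO style: each sampled
    action in a group is scored relative to the group's mean reward,
    normalized by the group's standard deviation, so no value network
    is needed as a baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Toy example: a group of G = 4 candidate antenna-position /
# beamforming actions sampled for the same channel state, each
# scored by a hypothetical sum-rate reward (bits/s/Hz).
sum_rates = [2.1, 3.4, 1.8, 2.9]
adv = group_relative_advantages(sum_rates)
print(np.round(adv, 3))
```

The normalized advantages then weight the policy-gradient update in place of a critic's value estimates, which is what lets the method roughly halve model size and FLOPs relative to an actor-critic PPO setup.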
Problem

Research questions and friction points this paper is trying to address.

blind interference alignment
fluid antenna system
imperfect CSI
robust sum-rate maximization
MISO downlink
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fluid Antenna System
Blind Interference Alignment
Group Relative Policy Optimization
Robust Sum-Rate Maximization
Deep Reinforcement Learning
Jianqiu Peng
School of Information Science and Technology, Harbin Institute of Technology, Shenzhen, China
Tong Zhang
Guangdong Provincial Key Laboratory of Aerospace Communication and Networking Technology, Harbin Institute of Technology, Shenzhen, 518055, China
Shuai Wang
Shenzhen Institutes of Advanced Technology
autonomous systems, wireless communications
Mingjie Shao
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
signal processing, wireless communication, optimization, machine learning
Hao Xu
Southeast University
wireless communication, mathematical optimization, information theory, MIMO systems
Rui Wang
Southern University of Science and Technology