Group Relative Policy Optimization for Robust Blind Interference Alignment with Fluid Antennas

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the sum-rate optimization problem for K-user MISO downlink systems under imperfect channel state information by introducing, for the first time, a robust transmission framework that integrates fluid antennas with blind interference alignment. The authors propose a critic-free Group Relative Policy Optimization (GRPO) algorithm, which leverages group-based relative policy exploration to escape poor local optima while jointly optimizing antenna positions and beamforming vectors. Experimental results demonstrate that GRPO achieves a 4.17% sum-rate improvement over standard PPO and a 30.29% gain over pre-trained PPO. Moreover, it outperforms the MaximumGain and RandomGain baselines by 200.78% and 465.38%, respectively, while reducing both model size and computational overhead by nearly 50%.

📝 Abstract
Fluid antenna system (FAS) leverages dynamic reconfigurability to unlock spatial degrees of freedom and reshape wireless channels. Blind interference alignment (BIA) aligns interference through antenna switching. This paper proposes, for the first time, a robust fluid antenna-driven BIA framework for a K-user MISO downlink under imperfect channel state information (CSI). We formulate a robust sum-rate maximization problem through optimizing fluid antenna positions (switching positions). To solve this challenging non-convex problem, we employ group relative policy optimization (GRPO), a novel deep reinforcement learning algorithm that eliminates the critic network. This robust design reduces model size and floating point operations (FLOPs) by nearly half compared to proximal policy optimization (PPO) while significantly enhancing performance through group-based exploration that escapes bad local optima. Simulation results demonstrate that GRPO outperforms PPO by 4.17%, and a 100K-step pre-trained PPO by 30.29%. Due to error distribution learning, GRPO exceeds heuristic MaximumGain and RandomGain by 200.78% and 465.38%, respectively.
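The abstract's key algorithmic idea is that GRPO replaces PPO's learned critic with a group-based baseline: for each state, a group of candidate actions is sampled, and each action's advantage is its reward measured relative to the group's statistics. A minimal sketch of that critic-free advantage step is below; the function name, the group size, and the sum-rate reward values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Critic-free advantage estimate in the GRPO style: each sampled
    action in a group is scored relative to the group's mean reward,
    normalized by the group's standard deviation, so no value network
    is needed as a baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Toy example: a group of G = 4 candidate antenna-position /
# beamforming actions sampled for the same channel state, each
# scored by a hypothetical sum-rate reward (bits/s/Hz).
sum_rates = [2.1, 3.4, 1.8, 2.9]
adv = group_relative_advantages(sum_rates)
print(np.round(adv, 3))
```

The normalized advantages then weight the policy-gradient update in place of a critic's value estimates, which is what lets the method roughly halve model size and FLOPs relative to an actor-critic PPO setup.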
Problem

Research questions and friction points this paper is trying to address.

blind interference alignment
fluid antenna system
imperfect CSI
robust sum-rate maximization
MISO downlink
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fluid Antenna System
Blind Interference Alignment
Group Relative Policy Optimization
Robust Sum-Rate Maximization
Deep Reinforcement Learning
Jianqiu Peng
School of Information Science and Technology, Harbin Institute of Technology, Shenzhen, China
Tong Zhang
Guangdong Provincial Key Laboratory of Aerospace Communication and Networking Technology, Harbin Institute of Technology, Shenzhen, 518055, China
Shuai Wang
Shenzhen Institutes of Advanced Technology
autonomous systems, wireless communications
Mingjie Shao
Academy of Mathematics and Systems Science, Chinese Academy of Sciences
signal processing, wireless communication, optimization, machine learning
Hao Xu
Southeast University
wireless communication, mathematical optimization, information theory, MIMO systems
Rui Wang
Southern University of Science and Technology