Learning to Focus: CSI-Free Hierarchical MARL for Reconfigurable Reflectors

📅 2026-04-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the high channel state information (CSI) estimation overhead and the curse of dimensionality in centralized optimization for large-scale millimeter-wave networks by proposing a CSI-free hierarchical multi-agent reinforcement learning architecture. The approach replaces conventional channel estimation with user location information and uses a two-level controller hierarchy to cooperatively manage mechanically reconfigurable reflective surfaces for beam focusing. A high-level controller assigns users to reflectors, and low-level controllers choose each reflector's focal point, so CSI acquisition overhead is eliminated while focusing accuracy is preserved. Implemented with MAPPO under centralized training with decentralized execution (CTDE), the system achieves up to a 7.79 dB gain in received signal strength over centralized baselines in ray-tracing simulations, scales to multiple users, and remains robust under sub-meter localization errors.
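To put the reported figure in perspective, the short sketch below converts a decibel gain into a linear power ratio. It assumes the 7.79 dB gain is a ratio of received powers (hence the factor of 10 rather than 20); the function name is illustrative and not taken from the paper.

```python
def db_to_linear_power_ratio(gain_db):
    """Convert a power gain in dB to a linear ratio: ratio = 10^(dB / 10)."""
    return 10 ** (gain_db / 10)


# Assumption: the reported gain compares received powers.
ratio = db_to_linear_power_ratio(7.79)
print(f"{ratio:.2f}")  # about 6.01, i.e. roughly six times the received power
```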
📝 Abstract
Reconfigurable Intelligent Surfaces (RIS) have the potential to engineer smart radio environments for next-generation millimeter-wave (mmWave) networks. However, the prohibitive computational overhead of Channel State Information (CSI) estimation and the dimensionality explosion inherent in centralized optimization severely hinder practical large-scale deployments. To overcome these bottlenecks, we introduce a "CSI-free" paradigm powered by a Hierarchical Multi-Agent Reinforcement Learning (HMARL) architecture to control mechanically reconfigurable reflective surfaces. By substituting pilot-based channel estimation with accessible user localization data, our framework leverages spatial intelligence for macro-scale wave propagation management. The control problem is decomposed into a two-tier neural architecture: a high-level controller executes temporally extended, discrete user-to-reflector allocations, while low-level controllers autonomously optimize continuous focal points using Multi-Agent Proximal Policy Optimization (MAPPO) under a Centralized Training with Decentralized Execution (CTDE) scheme. Deterministic ray-tracing evaluations show that this hierarchical framework achieves RSSI improvements of up to 7.79 dB over centralized baselines. Furthermore, the system scales to multiple users and maintains beam-focusing performance under practical sub-meter localization errors. By eliminating CSI overhead while maintaining high-fidelity signal redirection, this work establishes a scalable and cost-effective blueprint for intelligent wireless environments.
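The two-tier decomposition described in the abstract can be illustrated with a minimal control-loop sketch. This is not the authors' implementation: the nearest-reflector rule and the centroid rule are hypothetical placeholders standing in for the learned high-level and low-level policies, MAPPO training is not shown, and all function names and parameters (`allocate_users`, `choose_focal_point`, `high_level_period`, `noise_std`) are assumptions made for illustration.

```python
import math
import random


def allocate_users(user_positions, reflector_positions):
    """High-level step: discrete user-to-reflector allocation.

    Placeholder rule (assumption): assign each user to the nearest reflector.
    A trained high-level policy would replace this.
    """
    return {
        u: min(
            range(len(reflector_positions)),
            key=lambda r: math.dist(upos, reflector_positions[r]),
        )
        for u, upos in enumerate(user_positions)
    }


def choose_focal_point(assigned_positions, noise_std=0.0, rng=random):
    """Low-level step: continuous focal point from location data only (no CSI).

    Placeholder rule (assumption): centroid of the assigned users' reported
    locations. noise_std adds Gaussian noise to mimic localization error.
    """
    if not assigned_positions:
        return None  # reflector has no users to serve
    n = len(assigned_positions)
    dims = len(assigned_positions[0])
    centroid = [sum(p[d] for p in assigned_positions) / n for d in range(dims)]
    return tuple(c + rng.gauss(0.0, noise_std) for c in centroid)


def control_loop(user_positions, reflector_positions, steps=6, high_level_period=3):
    """Run both tiers: the allocation is held for high_level_period steps
    (temporally extended), while focal points are recomputed every step."""
    allocation = {}
    history = []
    for t in range(steps):
        if t % high_level_period == 0:
            allocation = allocate_users(user_positions, reflector_positions)
        focal_points = {}
        for r in range(len(reflector_positions)):
            assigned = [user_positions[u] for u, rr in allocation.items() if rr == r]
            focal_points[r] = choose_focal_point(assigned)
        history.append((t, dict(allocation), focal_points))
    return history


if __name__ == "__main__":
    users = [(0.0, 0.0, 1.0), (10.0, 0.0, 1.0)]
    reflectors = [(1.0, 0.0, 2.0), (9.0, 0.0, 2.0)]
    for step, alloc, focus in control_loop(users, reflectors, steps=4, high_level_period=2):
        print(step, alloc, focus)
```

Each low-level call sees only the locations of its own assigned users, which mirrors the decentralized-execution half of CTDE. A centralized critic would be needed only during training.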
Problem

Research questions and friction points this paper is trying to address.

Reconfigurable Intelligent Surfaces
Channel State Information
millimeter-wave networks
computational overhead
dimensionality explosion
Innovation

Methods, ideas, or system contributions that make the work stand out.

CSI-free
Hierarchical MARL
Reconfigurable Intelligent Surfaces
Spatial Intelligence
CTDE
Hieu Le
UNC-Charlotte; EPFL; Stony Brook University
Computer Vision, Machine Learning
Mostafa Ibrahim
Nutrien Ltd
Soil Science & Agronomy
Oguz Bedir
Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
Jian Tao
Texas A&M University
digital twin, machine learning, deep learning, high performance computing, data science
Sabit Ekin
Engineering Technology, and Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA