🤖 AI Summary
This work addresses the high overhead of channel state information (CSI) estimation that hinders practical deployment of reconfigurable intelligent surfaces (RIS). The authors propose a CSI-free, AI-native approach that integrates multi-agent reinforcement learning (MARL) with spatial abstraction for the first time. Using a centralized training with decentralized execution (CTDE) framework and the MAPPO algorithm, the method controls a mechanically tunable metallic reflectarray by mapping high-dimensional mechanical constraints onto a low-dimensional virtual focal space, enabling cooperative beam focusing driven by user coordinates. Without relying on CSI, the approach delivers up to 26.86 dB gain over static flat reflectors in dynamic non-line-of-sight (NLOS) environments and outperforms single-agent and hardware-constrained DRL baselines in spatial selectivity and temporal stability. It maintains stable coverage even under 1.0-meter user localization noise.
📝 Abstract
Reconfigurable Intelligent Surfaces (RIS) are pivotal for next-generation smart radio environments, yet their practical deployment is severely bottlenecked by the intractable computational overhead of Channel State Information (CSI) estimation. To bypass this fundamental physical-layer barrier, we propose an AI-native, data-driven paradigm that replaces complex channel modeling with spatial intelligence. This paper presents a fully autonomous Multi-Agent Reinforcement Learning (MARL) framework to control mechanically adjustable metallic reflector arrays. We map high-dimensional mechanical constraints to a reduced-order virtual focal point space and, on top of this abstraction, deploy a Centralized Training with Decentralized Execution (CTDE) architecture. Using Multi-Agent Proximal Policy Optimization (MAPPO), our decentralized agents learn cooperative beam-focusing strategies from user coordinates, achieving CSI-free operation. High-fidelity ray-tracing simulations in dynamic non-line-of-sight (NLOS) environments demonstrate that this multi-agent approach rapidly adapts to user mobility, yielding a gain of up to 26.86 dB over static flat reflectors and outperforming single-agent and hardware-constrained DRL baselines in both spatial selectivity and temporal stability. Crucially, the learned policies remain robust in deployment, sustaining stable signal coverage even under 1.0-meter localization noise. These results validate the efficacy of MARL-driven spatial abstractions as a scalable, highly practical pathway toward AI-empowered wireless networks.
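To illustrate what a reduced-order virtual focal point space could look like, the following minimal sketch turns a single 3-D focal point into one height offset per reflector element. It is not the authors' implementation. The function `element_offsets`, the 28 GHz wavelength, the panel geometry, and the rule that an offset of dz changes the round-trip path by about 2·dz (near-normal incidence) are all assumptions made for illustration only.

```python
import math

WAVELENGTH = 0.0107  # metres; hypothetical carrier near 28 GHz, not taken from the paper


def element_offsets(focal_point, source, elements, wavelength=WAVELENGTH):
    """Map one 3-D virtual focal point to a height offset per reflector element.

    Assumption: raising an element by dz shortens the round-trip path by
    roughly 2 * dz, so each element's excess path length relative to the
    first element, wrapped to one wavelength, is halved to get its offset.
    """
    # Total path: source -> element -> focal point.
    paths = [math.dist(source, e) + math.dist(e, focal_point) for e in elements]
    reference = paths[0]
    return [((p - reference) % wavelength) / 2.0 for p in paths]


# Hypothetical 4x4 panel in the z = 0 plane with 5 cm pitch.
elements = [(0.05 * i, 0.05 * j, 0.0) for i in range(4) for j in range(4)]
offsets = element_offsets(
    focal_point=(1.0, 0.5, 1.5),
    source=(0.0, 0.0, 2.0),
    elements=elements,
)
```

Under these assumptions, three numbers (the focal point) determine all sixteen element offsets, so an agent would act in a 3-dimensional space instead of choosing each element height separately.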