🤖 AI Summary
This work addresses the joint optimization of beamforming, RIS phase shifts, and UAV three-dimensional positioning in STAR-RIS-aided UAV communication systems to maximize the sum rate. The problem is challenging because its action space is continuous, high-dimensional, and heterogeneous; to tackle it, we propose a Convolution-Augmented Deep Deterministic Policy Gradient (CA-DDPG) algorithm. Specifically, CA-DDPG adds a stochastic perturbation mechanism to enhance exploration and integrates convolutional layers into the critic network to capture spatial correlations in the state-action value function. Simulation results show that the proposed method outperforms baseline algorithms across diverse channel conditions, achieving higher sum rates and greater robustness. These findings support the effectiveness of CA-DDPG for dynamic air-ground-space cooperative resource optimization in next-generation wireless networks.
📝 Abstract
With rapid advances in programmable materials, reconfigurable intelligent surfaces (RIS) have become a pivotal technology for future wireless communications. A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) can both transmit and reflect signals, enabling comprehensive signal control and expanding application scenarios. This paper introduces an unmanned aerial vehicle (UAV) to further enhance system flexibility and proposes an optimization design for the spectral efficiency of the STAR-RIS-UAV-assisted wireless communication system. We present a deep reinforcement learning (DRL) algorithm that iteratively optimizes beamforming, phase shifts, and UAV positioning to maximize the system's sum rate through continuous interaction with the environment. Because deterministic policies explore poorly on their own, we introduce a stochastic perturbation factor to strengthen exploration. As exploration is strengthened, the algorithm's ability to accurately evaluate the state-action value function becomes critical. Thus, building on the deep deterministic policy gradient (DDPG) algorithm, we propose a convolution-augmented deep deterministic policy gradient (CA-DDPG) algorithm that balances exploration and evaluation to improve the system's sum rate. Simulation results demonstrate that the CA-DDPG algorithm interacts effectively with the environment, optimizing the beamforming matrix, phase shift matrix, and UAV location, thereby improving system capacity and outperforming the other algorithms considered.
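To make the exploration idea concrete, the following is a minimal sketch of how a stochastic perturbation can be applied to the output of a deterministic policy. The abstract does not specify the form of the perturbation factor, so everything here is an assumption for illustration: zero-mean Gaussian noise, an exponentially decaying noise scale, actions normalized to [-1, 1], and the hypothetical helper names `perturb_action` and `decayed_sigma`. It is not the authors' implementation.

```python
import random


def perturb_action(action, sigma, rng, low=-1.0, high=1.0):
    """Add zero-mean Gaussian noise to each action component, then clip.

    Assumption: Gaussian noise and [-1, 1] bounds are illustrative choices,
    not details taken from the paper.
    """
    noisy = []
    for a in action:
        value = a + rng.gauss(0.0, sigma)
        noisy.append(min(high, max(low, value)))
    return noisy


def decayed_sigma(sigma0, decay, step, sigma_min=0.0):
    """Shrink the noise scale exponentially so exploration fades over training."""
    return max(sigma_min, sigma0 * decay ** step)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Placeholder actor output; in the paper's setting the action would stack
# beamforming, phase-shift, and UAV-position components.
actor_output = [0.2, -0.7, 0.95]

sigma = decayed_sigma(sigma0=0.3, decay=0.99, step=100)
explored = perturb_action(actor_output, sigma, rng)
```

Clipping after adding noise keeps the perturbed action inside the normalized range, and decaying the noise scale shifts the agent from exploration toward exploitation as the critic's value estimates improve.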