🤖 AI Summary
This work addresses degraded coordination and incomplete state awareness in multi-UAV relay networks with constrained communication, where delayed information exchange limits throughput. The authors propose a delay-tolerant multi-agent deep reinforcement learning (MADRL) algorithm that jointly optimizes trajectory planning, network formation, and transmission control, and uses a delay-penalized reward to encourage information sharing among UAVs. A spatio-temporal attention-based predictor recovers state information lost over unreliable channels. In simulations, the method achieves a 75% throughput gain and reduces information delay by over 50% compared to conventional MADRL, while also reducing the need for information exchange among UAVs. These results suggest improved practicality for deploying MADRL in UAV-assisted wireless networks.
📝 Abstract
In this paper, we employ multiple UAVs to accelerate data transmissions from ground users (GUs) to a remote base station (BS) via the UAVs' relay communications. The UAVs' intermittent information exchanges typically delay their acquisition of the complete system state and hinder effective collaboration. To maximize the overall throughput, we first propose a delay-tolerant multi-agent deep reinforcement learning (MADRL) algorithm that integrates a delay-penalized reward to encourage information sharing among UAVs, while jointly optimizing the UAVs' trajectory planning, network formation, and transmission control strategies. Additionally, considering information loss due to unreliable channel conditions, we further propose a spatio-temporal attention-based prediction approach to recover the lost information and enhance each UAV's awareness of the network state. These two designs are envisioned to enhance the network capacity in UAV-assisted wireless networks with limited communications. The simulation results reveal that our new approach achieves a reduction of over 50% in information delay and a 75% throughput gain compared to the conventional MADRL. Interestingly, improving the UAVs' information sharing does not sacrifice the network capacity. Instead, it significantly improves the learning performance and the throughput simultaneously. The approach also reduces the need for information exchange among UAVs, thus fostering practical deployment of MADRL in UAV-assisted wireless networks.
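The abstract does not give the form of the delay-penalized reward, so the following is only a minimal illustrative sketch under stated assumptions: the reward is taken to be the per-step throughput minus a linear penalty on the mean age of the state information a UAV holds about its peers. The names `UavObservation`, `info_age_steps`, and `penalty_weight`, as well as the linear form itself, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UavObservation:
    """Per-step quantities seen by one UAV (hypothetical structure)."""
    throughput: float          # data relayed toward the BS in this step
    info_age_steps: List[int]  # age, in steps, of the latest state received from each peer


def delay_penalized_reward(obs: UavObservation, penalty_weight: float = 0.1) -> float:
    """Throughput minus a linear penalty on mean information age.

    Assumption: a linear penalty; the paper's actual reward may differ.
    """
    if obs.info_age_steps:
        mean_age = sum(obs.info_age_steps) / len(obs.info_age_steps)
    else:
        mean_age = 0.0  # no peers, so nothing can be stale
    return obs.throughput - penalty_weight * mean_age


# Same throughput, but staler peer information yields a lower reward.
fresh = delay_penalized_reward(UavObservation(10.0, [0, 0]), penalty_weight=0.5)
stale = delay_penalized_reward(UavObservation(10.0, [2, 4]), penalty_weight=0.5)
```

Under this assumed form, an agent that lets its peer information grow stale earns less for the same throughput, which is one way a reward could encourage information sharing.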