🤖 AI Summary
This work addresses the joint optimization of unmanned aerial vehicle (UAV) three-dimensional trajectory, MIMO receive beamforming, ground node (GN) scheduling, and transmit power allocation in multi-antenna UAV-assisted IoT data collection, with the aim of maximizing the sum data collection (SDC) volume. The problem is highly coupled and non-convex, which makes it hard to solve with conventional optimization techniques. To tackle this challenge, the authors propose a double-loop optimization-driven deep reinforcement learning (DRL) algorithm and an end-to-end, fully DRL-based solution built on Rainbow DQN, enabling coordinated decision-making over a continuous–discrete hybrid action space. Simulation results show that the proposed approach increases SDC in dynamic environments, with an average gain of 32.7% over benchmark algorithms. The summary also reports high computational efficiency and strong robustness for non-convex joint resource optimization in UAV-assisted wireless networks.
📝 Abstract
Unmanned aerial vehicle (UAV)-assisted Internet of Things (IoT) systems have become an important part of future wireless communications. To achieve a higher communication rate, the joint design of the UAV trajectory and resource allocation is crucial. This letter considers a scenario in which a multi-antenna UAV is dispatched to simultaneously collect data from multiple ground IoT nodes (GNs) within a time interval. To improve the sum data collection (SDC) volume, i.e., the total data volume transmitted by the GNs, the UAV trajectory, the UAV receive beamforming, the scheduling of the GNs, and the transmit power of the GNs are jointly optimized. Since the problem is non-convex and the optimization variables are highly coupled, it is hard to solve using traditional optimization methods. To find a near-optimal solution, a double-loop structured optimization-driven deep reinforcement learning (DRL) algorithm and a fully DRL-based algorithm are proposed. Simulation results verify that the proposed algorithms outperform two benchmarks, with significant improvements in SDC volume.
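To make the SDC objective concrete, the following is a minimal sketch of how the total collected data could be evaluated for a given trajectory, schedule, and power allocation. It is not the letter's system model: it assumes a free-space line-of-sight channel, maximum-ratio combining across the UAV antennas, and no interference between simultaneously scheduled GNs. The function name `sum_data_collection` and all parameters (`num_antennas`, `slot_len`, `bandwidth`, `beta0`, `noise_power`) are hypothetical and chosen only for illustration.

```python
import math


def sum_data_collection(uav_path, gn_positions, schedule, power, *,
                        num_antennas=4, slot_len=1.0, bandwidth=1e6,
                        beta0=1e-3, noise_power=1e-10):
    """Total bits collected over all time slots (illustrative model only).

    uav_path:     list of (x, y, z) UAV positions, one per time slot
    gn_positions: list of (x, y, z) ground-node positions
    schedule:     schedule[t][k] is truthy if GN k transmits in slot t
    power:        power[t][k] is the transmit power of GN k in slot t (W)

    Assumptions (not taken from the letter): line-of-sight path loss
    beta0 / d^2, an MRC array gain equal to num_antennas, and no
    inter-GN interference.
    """
    total_bits = 0.0
    for t, uav in enumerate(uav_path):
        for k, gn in enumerate(gn_positions):
            if not schedule[t][k]:
                continue
            # Squared UAV-to-GN distance; assumes the UAV is never co-located with a GN.
            dist_sq = sum((u - g) ** 2 for u, g in zip(uav, gn))
            snr = power[t][k] * num_antennas * beta0 / (dist_sq * noise_power)
            total_bits += slot_len * bandwidth * math.log2(1.0 + snr)
    return total_bits


# Usage: two slots, two GNs, one GN scheduled per slot.
path = [(0.0, 0.0, 100.0), (50.0, 0.0, 100.0)]
gns = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
sched = [[1, 0], [0, 1]]
pwr = [[0.1, 0.0], [0.0, 0.1]]
sdc_bits = sum_data_collection(path, gns, sched, pwr)
```

Even in this simplified form, the coupling described in the abstract is visible: the UAV position sets every distance term, while the schedule and power decide which terms count and how much each contributes, so none of the variables can be tuned in isolation.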