🤖 AI Summary
To address resource constraints, low search efficiency, and training instability in multi-UAV trajectory planning for wireless power transfer (WPT)-assisted IoT systems, this paper proposes AUTO (Attention-based UAV Trajectory Optimization), a framework comprising two core components. First, ATOM (Attention Trajectory Optimization Model) is a graph Transformer-based architecture in which a graph encoder computes self-attention features over all IoT devices and a trajectory decoder optimizes the number and trajectories of UAVs. Second, TENMA (Trajectory lEarNing Method based on Actor-critic) trains ATOM with an improved Actor-Critic method that uses the real system reward as the baseline to reduce variance, improving stability and scalability for large-scale collaborative planning. Extensive simulations and a field hardware experiment demonstrate that AUTO substantially outperforms baseline methods in both energy transfer efficiency and IoT node wake-up rate. The framework achieves high accuracy, low computational overhead, and strong scalability, making it suitable for practical WPT-enabled IoT deployments.
📝 Abstract
Unmanned Aerial Vehicles (UAVs) in Wireless Power Transfer (WPT)-assisted Internet of Things (IoT) systems face two challenges: limited resources and suboptimal trajectory planning. Reinforcement learning-based trajectory planning schemes suffer from low search efficiency and learning instability when optimizing large-scale systems. To address these issues, we present an Attention-based UAV Trajectory Optimization (AUTO) framework based on the graph transformer, which consists of an Attention Trajectory Optimization Model (ATOM) and a Trajectory lEarNing Method based on Actor-critic (TENMA). In ATOM, a graph encoder computes the self-attention features of all IoT devices (IoTDs), and a trajectory decoder optimizes the number and trajectories of UAVs. TENMA then trains ATOM using an improved Actor-Critic method, in which the real reward of the system is applied as the baseline to reduce variance in the critic network. This method is suitable for high-quality, large-scale multi-UAV trajectory planning. Finally, we conduct numerous experiments, including a hardware experiment in a field setting, to verify the feasibility and efficiency of the AUTO framework.
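To make the graph encoder's role concrete, the following is a minimal sketch of single-head scaled dot-product self-attention over IoTD features. It is not the ATOM architecture itself: the function name `self_attention`, the use of raw 2-D positions as node features, and the identity query/key/value projections are all assumptions made for illustration, where a trained encoder would use learned weight matrices and multiple layers.

```python
import math

def self_attention(nodes):
    """Single-head scaled dot-product self-attention over node features.

    Assumption: queries, keys, and values are the raw features themselves
    (identity projections); a trained encoder would learn these projections.
    """
    d = len(nodes[0])
    out = []
    for q in nodes:
        # Scaled dot-product score of this node against every node.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in nodes]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention output: weighted average of all node features.
        out.append([sum(w * v[j] for w, v in zip(weights, nodes)) for j in range(d)])
    return out

# Hypothetical normalized (x, y) positions of three IoT devices.
iotd_positions = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
embeddings = self_attention(iotd_positions)
```

Each row of `embeddings` mixes information from every device, which is the property a trajectory decoder can exploit when choosing which device a UAV should visit next.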