🤖 AI Summary
This paper addresses the joint UAV trajectory planning and user scheduling problem in UAV-assisted IoT networks, aiming to minimize the average Age of Information (AoI) while satisfying long-term energy constraints. To overcome limitations of standard Decision Transformers (DTs)—including inflexibility to dynamic user scales, inadequate modeling of long-term energy dynamics, and poor few-shot transferability—we propose an enhanced DT framework. Our contributions are threefold: (1) a variable-length attention mechanism enabling adaptive handling of varying numbers of users; (2) a prompt-learning module grounded in short trajectory demonstrations, facilitating rapid, annotation-free adaptation to new scenarios; and (3) a tokenized energy constraint modeling module for precise long-term energy regulation. Evaluated under an offline pretraining plus lightweight fine-tuning paradigm, our method achieves 2× faster convergence and an 8% reduction in average AoI, significantly outperforming baseline DTs and conventional optimization approaches.
📝 Abstract
Decision Transformer (DT) has recently demonstrated stronger generalizability than conventional deep reinforcement learning (DRL) in dynamic resource allocation within unmanned aerial vehicle (UAV) networks. However, its performance is hindered by zero-padding for varying state dimensions, an inability to manage long-term energy constraints, and the difficulty of acquiring expert samples for few-shot fine-tuning in new scenarios. To overcome these limitations, we propose an attention-enhanced prompt Decision Transformer (APDT) framework to optimize trajectory planning and user scheduling, aiming to minimize the average age of information (AoI) under a long-term energy constraint in UAV-assisted Internet of Things (IoT) networks. Specifically, we enhance the conventional DT framework by incorporating an attention mechanism to accommodate varying numbers of terrestrial users, introducing a prompt mechanism based on short trajectory demonstrations for rapid adaptation to new scenarios, and designing a token-assisted method to address the UAV's long-term energy constraint. The APDT framework is first pre-trained on offline datasets and then efficiently generalized to new scenarios. Simulations demonstrate that APDT converges twice as fast as conventional DT and reduces the average AoI by $8\%$.
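To illustrate why an attention mechanism can replace zero-padding when the number of terrestrial users varies, the following is a minimal sketch of attention pooling in plain Python. It is not the APDT architecture, whose details the abstract does not give. The function name `attention_pool`, the per-user feature layout (x, y, AoI), and the use of raw features as both keys and values (no learned projections) are assumptions made purely for illustration.

```python
import math
import random


def attention_pool(user_feats, query):
    """Pool per-user feature vectors into one vector of fixed length.

    user_feats: list of per-user feature lists, any number of users (>= 1).
    query: list with the same length as each feature vector (hypothetical
           stand-in for a learned query).
    """
    dim = len(query)
    # Scaled dot-product score between the query and each user's features.
    scores = [
        sum(q * x for q, x in zip(query, feat)) / math.sqrt(dim)
        for feat in user_feats
    ]
    # Numerically stable softmax over however many users are present.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of user features: output length is dim, not num_users.
    return [
        sum(w * feat[i] for w, feat in zip(weights, user_feats))
        for i in range(dim)
    ]


random.seed(0)
query = [random.gauss(0, 1) for _ in range(3)]
# Hypothetical features per user: (x position, y position, current AoI).
users_5 = [[random.random() for _ in range(3)] for _ in range(5)]
users_12 = [[random.random() for _ in range(3)] for _ in range(12)]

emb_5 = attention_pool(users_5, query)
emb_12 = attention_pool(users_12, query)
print(len(emb_5), len(emb_12))
```

Both calls return a vector of the same length regardless of how many users are present, so a downstream model sees a fixed-size state without padded entries. The pooled result also does not depend on the order in which users are listed.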