AI Summary
To address insufficient autonomy and adaptability in multi-UAV cooperative trajectory optimization for low-altitude economic networks, this paper proposes ARMAIT, a unified end-to-end framework. First, it employs Agentic Retrieval-Augmented Generation (Agentic RAG) to autonomously parse task requirements. Second, it introduces a Mamba-Attention hybrid architecture (MAIT) that jointly optimizes long-range dependency modeling efficiency and fine-grained local feature capture. Third, it formulates Trajectory Group Relative Policy Optimization (T-GRPO), a novel algorithm unifying discrete task assignment and continuous trajectory control within a single joint policy space. Experiments across diverse-scale multi-UAV systems demonstrate that ARMAIT significantly improves planning efficiency, robustness, and generalization capability compared to state-of-the-art baselines. The framework establishes a scalable, interpretable, and end-to-end decision-making paradigm for low-altitude intelligent traffic management.
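The summary describes T-GRPO only at a high level. As an illustrative sketch (not the paper's implementation; the function names, the clip value, and the additive combination of discrete and continuous log-probabilities are our assumptions), GRPO-family methods replace a learned value critic with group-relative advantages: each sampled trajectory's return is normalized against the mean and standard deviation of its group, then fed into a clipped surrogate objective over the joint policy's log-probabilities:

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantage: normalize each trajectory's return
    against the mean/std of its sampled group (no value critic)."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

def t_grpo_objective(logp_new, logp_old, rewards, clip=0.2):
    """Clipped surrogate averaged over a group of trajectories.
    logp_* are joint log-probabilities, i.e. the sum of the discrete
    (task assignment) and continuous (trajectory control) terms, so
    one objective covers both parts of the policy."""
    adv = group_relative_advantages(rewards)
    ratio = np.exp(np.asarray(logp_new) - np.asarray(logp_old))
    clipped = np.clip(ratio, 1.0 - clip, 1.0 + clip)
    # Pessimistic (min) combination, as in PPO-style clipping.
    return np.minimum(ratio * adv, clipped * adv).mean()
```

Because the advantage is relative within the group, trajectories are ranked against their peers sampled from the same task, which is what lets a single scalar reward drive both the discrete and continuous components.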
Abstract
This paper proposes a novel Agentic Retrieval-Augmented Generation with Mamba-Attention Integrated Transformer (ARMAIT) framework for multi-Unmanned Aerial Vehicle (UAV) trajectory optimization. The framework is built upon Large Language Models (LLMs), incorporating Retrieval-Augmented Generation (RAG) empowered by Agentic AI and integrated with a UAV-specific knowledge base. Through Agentic RAG, the LLM autonomously interprets high-level task requirements and identifies the key components necessary for trajectory optimization, including model inputs and outputs, network architecture, reward functions, and task constraints. To support efficient modeling across different system scales, we introduce the Mamba-Attention Integrated Transformer (MAIT), a hybrid neural architecture that combines the long-range dependency modeling capability of attention mechanisms with the efficient temporal dynamic representation of Mamba. Furthermore, a Trajectory Group Relative Policy Optimization (T-GRPO) method is proposed to achieve unified policy gradient optimization in both discrete and continuous trajectory spaces for MAIT training. Extensive experimental results validate the feasibility and effectiveness of the proposed ARMAIT framework.
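The abstract characterizes MAIT as pairing attention (global, long-range mixing) with a Mamba-style state-space path (linear-time temporal recurrence). As a minimal illustration under stated assumptions (`mait_block`, the additive fusion of the two paths, and the scalar decay `a` are our simplifications, not the authors' architecture; real Mamba uses an input-dependent selective scan), a hybrid block can route a sequence through both paths and merge them:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, Wq, Wk, Wv):
    """Global pairwise mixing: every timestep attends to every other,
    capturing long-range dependencies at O(T^2) cost."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    return scores @ v

def ssm_scan(x, a=0.9):
    """Simplified state-space recurrence (a stand-in for Mamba's
    selective scan): h_t = a*h_{t-1} + x_t, linear in sequence length."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t, xt in enumerate(x):
        h = a * h + xt
        out[t] = h
    return out

def mait_block(x, Wq, Wk, Wv):
    """Hybrid block: sum the attention (global) and SSM (temporal) paths."""
    return self_attention(x, Wq, Wk, Wv) + ssm_scan(x)
```

The design intuition the abstract points at is visible here: the attention path's cost grows quadratically with sequence length while the scan is linear, so the hybrid can trade global context against efficiency as the number of UAVs and the planning horizon scale up.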