🤖 AI Summary
Rapid growth of the low-altitude economy has intensified challenges in intent recognition for non-cooperative unmanned aerial vehicles (UAVs), particularly in dynamic, contested low-altitude environments.
Method: This paper proposes a novel intent recognition architecture built on multimodal large language models (MLLMs), the first such application in low-altitude adversarial scenarios. The framework fuses heterogeneous inputs (UAV payload data, kinematic states, real-time environmental dynamics, and tactical prior knowledge) within a structured semantic reasoning pipeline to enable end-to-end understanding and generative inference of behavioral intent.
Contribution/Results: Compared to conventional unimodal or rule-based approaches, the proposed architecture significantly improves the system's cognitive understanding of non-cooperative targets under complex low-altitude conditions. Experimental evaluation on representative adversarial use cases demonstrates substantial improvements in intent recognition accuracy. The solution provides a deployable technical pathway and a practical paradigm for advancing autonomy and intelligence in low-altitude security systems.
📝 Abstract
The rapid development of the low-altitude economy underscores the critical need for effective perception and intent recognition of non-cooperative unmanned aerial vehicles (UAVs). The advanced generative reasoning capabilities of multimodal large language models (MLLMs) offer a promising approach to such tasks. In this paper, we focus on combining UAV intent recognition with MLLMs. Specifically, we first present an MLLM-enabled UAV intent recognition architecture in which a multimodal perception system obtains real-time payload and motion information about UAVs and converts it into structured input, and the MLLM then outputs intent recognition results by incorporating environmental information, prior knowledge, and tactical preferences. Subsequently, we review related work and show its progress in the context of the proposed architecture. Then, a low-altitude confrontation use case is presented to demonstrate the feasibility of our architecture and offer insights for practical system design. Finally, we discuss future challenges and give corresponding strategic recommendations for further applications.
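The data flow described in the abstract (structured perception output combined with environmental information, prior knowledge, and tactical preferences before being passed to an MLLM) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `UAVObservation` fields, the `build_intent_prompt` function, the prompt wording, and all sample values are hypothetical, and the actual MLLM call is omitted because the abstract does not specify a model or API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UAVObservation:
    """Hypothetical structured output of the multimodal perception system."""
    track_id: str
    payload: str          # detected payload type (assumed field)
    position_m: tuple     # (x, y, altitude) in metres (assumed field)
    velocity_mps: tuple   # (vx, vy, vz) in metres per second (assumed field)


def build_intent_prompt(obs, environment, prior_knowledge, tactical_preferences):
    """Assemble one text prompt from the four information sources.

    The section titles and instructions are illustrative assumptions; the
    paper's actual prompt format is not given in the abstract.
    """
    sections = {
        "UAV observation": asdict(obs),
        "Environment": environment,
        "Prior knowledge": prior_knowledge,
        "Tactical preferences": tactical_preferences,
    }
    lines = ["Infer the most likely intent of the UAV described below."]
    for title, content in sections.items():
        lines.append(f"## {title}")
        lines.append(json.dumps(content, indent=2))
    lines.append("Answer with one intent label and a short justification.")
    return "\n".join(lines)


# Made-up sample values, used only to show the shape of the input.
obs = UAVObservation("track-7", "camera", (120.0, 45.0, 80.0), (3.0, 0.5, 0.0))
prompt = build_intent_prompt(
    obs,
    environment={"weather": "clear", "restricted_zone_distance_m": 300},
    prior_knowledge=["Loitering near a restricted zone may indicate surveillance."],
    tactical_preferences={"priority": "protect restricted zone"},
)
print(prompt)
```

The resulting `prompt` string would then be sent to whichever MLLM the system uses, alongside any image or sensor inputs that model accepts. Keeping the four information sources in separate labeled sections makes it easier to inspect or swap out one source without touching the others.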