AI Summary
To address the slow response and low target recognition accuracy of manually piloted drones in complex, time-critical search-and-rescue (SAR) scenarios, this paper proposes a vision-language-driven rapid-response system. The method introduces a novel tightly coupled architecture integrating vision-language multimodal semantic understanding with nonlinear model predictive control (NMPC). It leverages a vision-language model (VLM) and ChatGPT-4o for natural language instruction parsing and contextual environmental understanding, while embedding real-time obstacle perception and safety-constrained trajectory optimization. Experimental results demonstrate that the system achieves millisecond-level command response latency, outperforming commercial autopilots by 33.75% in task completion speed and human pilots by 54.6%. This advancement significantly enhances operational efficiency, flight safety, and intuitive human-drone interaction in SAR missions.
Abstract
Emergency search and rescue (SAR) operations often require rapid and precise target identification in complex environments where traditional manual drone control is inefficient. To address these scenarios, this research develops a rapid SAR system, UAV-VLRR (Vision-Language-Rapid-Response). The system consists of two components: 1) A multimodal system that combines a Visual Language Model (VLM) with the natural language processing capabilities of ChatGPT-4o (LLM) for scene interpretation. 2) A nonlinear model predictive controller (NMPC) with built-in obstacle avoidance, which lets the drone respond rapidly and fly according to the output of the multimodal system. This work aims to improve response times in emergency SAR operations by giving the operator a more intuitive and natural way to plan the SAR mission while allowing the drone to carry out that mission rapidly and safely. In testing, our approach was on average 33.75% faster than an off-the-shelf autopilot and 54.6% faster than a human pilot. Video of UAV-VLRR: https://youtu.be/KJqQGKKt1xY
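The second component, a predictive controller that steers toward a goal while avoiding obstacles, can be illustrated with a toy receding-horizon planner. This is only a sketch under strong assumptions and not the paper's NMPC: the drone is reduced to a 2D point moving at constant speed, the optimizer is replaced by a search over a fixed set of candidate headings, the obstacle constraint is emulated with a large penalty, and the goal coordinate stands in for whatever target the multimodal system would output. All function names and parameter values are hypothetical.

```python
import math


def rollout(pos, heading, speed, dt, horizon):
    """Predict positions of a 2D point flying a constant heading (assumed model)."""
    x, y = pos
    points = []
    for _ in range(horizon):
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        points.append((x, y))
    return points


def rollout_cost(points, goal, obstacle, safe_radius):
    """Sum of distances to the goal, plus a large penalty for each predicted
    point inside the safety radius (a soft stand-in for a hard constraint)."""
    cost = 0.0
    for x, y in points:
        cost += math.hypot(goal[0] - x, goal[1] - y)
        if math.hypot(obstacle[0] - x, obstacle[1] - y) < safe_radius:
            cost += 1e6
    return cost


def plan_path(start, goal, obstacle, obstacle_radius, margin=0.5, speed=1.0,
              dt=0.5, horizon=6, n_headings=16, goal_tol=0.5, max_steps=200):
    """Receding-horizon loop: score every candidate heading over the horizon,
    apply only the first step of the best one, then re-plan."""
    safe_radius = obstacle_radius + margin
    headings = [2.0 * math.pi * i / n_headings for i in range(n_headings)]
    pos = start
    path = [pos]
    for _ in range(max_steps):
        if math.hypot(goal[0] - pos[0], goal[1] - pos[1]) < goal_tol:
            break  # close enough to the goal
        best = min(
            headings,
            key=lambda h: rollout_cost(
                rollout(pos, h, speed, dt, horizon), goal, obstacle, safe_radius
            ),
        )
        pos = rollout(pos, best, speed, dt, 1)[0]  # execute one step only
        path.append(pos)
    return path


if __name__ == "__main__":
    # Hypothetical scenario: the obstacle sits directly between start and goal.
    path = plan_path((0.0, 0.0), (10.0, 0.0), (5.0, 0.0), 1.0)
    print(f"{len(path) - 1} steps, final position {path[-1]}")
```

Re-planning at every step is what makes the loop receding-horizon: the obstacle is detected several steps ahead, so the path bends around it before the safety radius is reached. A real NMPC would instead optimize a continuous control sequence under the vehicle's nonlinear dynamics.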