UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue

📅 2025-03-04
🤖 AI Summary
To address the slow response and low target recognition accuracy of manually piloted drones in complex, time-critical search-and-rescue (SAR) scenarios, this paper proposes a vision-language-driven rapid-response system. The method introduces a novel tightly coupled architecture integrating vision-language multimodal semantic understanding with nonlinear model predictive control (NMPC). It leverages a vision-language model (VLM) and ChatGPT-4o for natural language instruction parsing and contextual environmental understanding, while embedding real-time obstacle perception and safety-constrained trajectory optimization. Experimental results demonstrate that the system achieves millisecond-level command response latency, outperforming commercial autopilots by 33.75% in task completion speed and human pilots by 54.6%. This advancement significantly enhances operational efficiency, flight safety, and intuitive human–drone interaction in SAR missions.

📝 Abstract
Emergency search and rescue (SAR) operations often require rapid and precise target identification in complex environments where traditional manual drone control is inefficient. To address these scenarios, this research develops a rapid SAR system, UAV-VLRR (Vision-Language-Rapid-Response). The system has two components: 1) a multimodal system that uses a Visual Language Model (VLM) and the natural language processing capabilities of ChatGPT-4o (LLM) for scene interpretation; 2) a nonlinear model predictive controller (NMPC) with built-in obstacle avoidance that lets the drone fly rapidly according to the output of the multimodal system. This work aims to improve response times in emergency SAR operations by giving the operator a more intuitive and natural way to plan the SAR mission while allowing the drone to carry out that mission rapidly and safely. When tested, our approach was on average 33.75% faster than an off-the-shelf autopilot and 54.6% faster than a human pilot. Video of UAV-VLRR: https://youtu.be/KJqQGKKt1xY
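The abstract names an NMPC with built-in obstacle avoidance but does not give its formulation. To make the idea concrete, the following is a minimal receding-horizon sketch in plain Python, not the authors' controller. Everything in it is an assumption made for illustration: the unicycle model standing in for the drone dynamics, the goal and obstacle positions, the cost weights, the soft obstacle penalty, and the simple projected gradient descent used in place of a dedicated nonlinear solver.

```python
import math

# All constants below are hypothetical values chosen for illustration.
DT = 0.1                      # discretisation step [s]
HORIZON = 15                  # number of prediction steps
V_MAX, W_MAX = 1.5, 2.0       # speed [m/s] and turn-rate [rad/s] limits
GOAL = (4.0, 0.0)             # target position [m]
OBSTACLE = (2.0, 0.1, 0.6)    # obstacle x, y and safety radius [m]

def step(state, control):
    """Unicycle kinematics: a simple nonlinear stand-in for the drone model."""
    x, y, yaw = state
    v, omega = control
    return (x + DT * v * math.cos(yaw),
            y + DT * v * math.sin(yaw),
            yaw + DT * omega)

def rollout(state, controls):
    """Predict the state trajectory produced by a control sequence."""
    traj = [state]
    for u in controls:
        traj.append(step(traj[-1], u))
    return traj

def cost(state, controls):
    """Goal tracking + control effort + soft penalty inside the safety radius."""
    ox, oy, r = OBSTACLE
    total = 0.0
    for (x, y, _), (v, omega) in zip(rollout(state, controls)[1:], controls):
        total += (x - GOAL[0]) ** 2 + (y - GOAL[1]) ** 2
        total += 0.01 * (v ** 2 + omega ** 2)
        violation = max(0.0, r - math.hypot(x - ox, y - oy))
        total += 100.0 * violation ** 2
    return total

def gradient(state, controls, eps=1e-5):
    """Forward finite-difference gradient of the cost w.r.t. every input."""
    base = cost(state, controls)
    grad = []
    for i, u in enumerate(controls):
        row = []
        for j in range(2):
            bumped = list(u)
            bumped[j] += eps
            trial = controls[:i] + [tuple(bumped)] + controls[i + 1:]
            row.append((cost(state, trial) - base) / eps)
        grad.append(tuple(row))
    return grad

def clip(u):
    """Project an input back onto the actuator limits."""
    return (min(max(u[0], 0.0), V_MAX), min(max(u[1], -W_MAX), W_MAX))

def solve(state, controls, iters=20):
    """Projected gradient descent with backtracking; only improving steps are kept."""
    best = cost(state, controls)
    for _ in range(iters):
        grad = gradient(state, controls)
        lr = 0.1
        while lr > 1e-6:
            trial = [clip((u[0] - lr * g[0], u[1] - lr * g[1]))
                     for u, g in zip(controls, grad)]
            trial_cost = cost(state, trial)
            if trial_cost < best:
                controls, best = trial, trial_cost
                break
            lr *= 0.5
    return controls

def run_nmpc(state, steps=40):
    """Receding-horizon loop: optimise, apply the first input, shift, repeat."""
    controls = [(0.5, 0.0)] * HORIZON
    path = [state]
    for _ in range(steps):
        controls = solve(state, controls)
        state = step(state, controls[0])
        path.append(state)
        controls = controls[1:] + [controls[-1]]   # warm start for the next solve
    return path
```

Calling `run_nmpc((0.0, 0.0, 0.0))` returns the list of visited states. The obstacle is handled here as a soft penalty, so the sketch does not guarantee clearance; a hard constraint handled by a proper NLP solver would be the more usual choice in practice.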
Problem

Research questions and friction points this paper is trying to address.

Enhances rapid target identification in complex SAR environments.
Integrates VLM and LLM for intuitive scene interpretation and mission planning.
Utilizes NMPC with obstacle avoidance for faster, safer drone response.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal system combining VLM and ChatGPT-4o
NMPC with obstacle avoidance for rapid drone response
Average speedup of 33.75% over an off-the-shelf autopilot and 54.6% over a human pilot