🤖 AI Summary
Heterogeneous navigation coordination between unmanned aerial vehicles (UAVs) and automated guided vehicles (AGVs) in dynamic warehouse environments is challenged by UAV energy constraints, payload limitations, and stringent, safety-critical collision-avoidance requirements.
Method: This paper proposes a semantic impedance coordination framework driven by a vision-language model with retrieval-augmented generation (VLM-RAG). It integrates VLM-based environmental semantic understanding with RAG-enhanced dynamic generation of impedance-control parameters, and further incorporates virtual impedance linkages and adaptive topology reconfiguration to enable multi-robot semantic collaboration.
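The RAG step can be pictured as a nearest-neighbor lookup over a small knowledge base that maps semantic scene descriptions to impedance parameters. The following is a minimal sketch under that assumption; the scene entries, parameter values, and the bag-of-words similarity are all illustrative, not taken from the paper (a real system would use learned embeddings over VLM captions).

```python
# Hedged sketch: RAG-style retrieval of impedance-control parameters.
# The knowledge-base entries and gains below are hypothetical examples.
from collections import Counter
import math

KNOWLEDGE_BASE = [
    {"scene": "open aisle no obstacles", "stiffness": 120.0, "damping": 15.0},
    {"scene": "narrow corridor with people nearby", "stiffness": 40.0, "damping": 30.0},
    {"scene": "cluttered shelving short obstacles", "stiffness": 60.0, "damping": 25.0},
]

def _embed(text):
    # Toy bag-of-words "embedding"; stands in for a real text encoder.
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_impedance_params(vlm_caption):
    # Return the knowledge-base entry most similar to the VLM's caption.
    q = _embed(vlm_caption)
    return max(KNOWLEDGE_BASE, key=lambda e: _cosine(q, _embed(e["scene"])))

params = retrieve_impedance_params("people walking in a narrow corridor")
```

A caption like "people walking in a narrow corridor" retrieves the softer, more damped parameter set, which matches the paper's idea of adapting compliance to the perceived environment.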
Contribution/Results: Unlike conventional artificial potential field (APF) or fixed-impedance approaches, the framework enables real-time, multimodal perception–guided response and cooperative obstacle avoidance. Evaluated across 12 realistic warehouse scenarios, it achieves a 92% task success rate. Under ideal illumination, VLM-RAG improves object recognition and control-parameter matching accuracy by 8%, while ground robots maintain stable safe following and dynamic collision avoidance performance.
📝 Abstract
With the growing demand for efficient logistics, unmanned aerial vehicles (UAVs) are increasingly being paired with automated guided vehicles (AGVs). While UAVs offer the ability to navigate through dense environments and varying altitudes, they are limited by battery life, payload capacity, and flight duration, necessitating coordinated ground support.
Focusing on heterogeneous navigation, SwarmVLM addresses these limitations by enabling semantic collaboration between UAVs and ground robots through impedance control. The system leverages a Vision-Language Model (VLM) and Retrieval-Augmented Generation (RAG) to adjust impedance-control parameters in response to environmental changes. In this framework, the UAV acts as the leader, using Artificial Potential Field (APF) planning for real-time navigation, while the ground robot follows via virtual impedance links with adaptive link topology to avoid collisions with short obstacles.
The system demonstrated a 92% success rate across 12 real-world trials. Under optimal lighting conditions, the VLM-RAG framework improved accuracy in object detection and impedance-parameter selection by 8%. The mobile robot prioritized avoidance of short obstacles, occasionally deviating laterally by up to 50 cm from the UAV path, demonstrating safe navigation in a cluttered setting.