VULCAN: Vision-Language-Model Enhanced Multi-Agent Cooperative Navigation for Indoor Fire-Disaster Response

📅 2026-04-14
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the significant performance degradation of existing vision-based multi-agent navigation methods in dynamic, hazardous environments such as fires, which undermines their reliability for search-and-rescue missions. To overcome this limitation, we introduce, for the first time, a vision-language model (VLM) into multi-agent collaborative navigation under fire conditions, integrating multimodal perception with semantic understanding. We propose a robust cooperative exploration mechanism tailored to handle smoke, high temperatures, and sensor degradation, and further develop the first physically realistic fire simulation benchmark by extending Habitat-Matterport3D. Experimental results demonstrate that our approach substantially outperforms current baselines in simulated fire scenarios, uncovering critical failure modes of conventional methods and confirming the decisive role of hazard awareness and semantic reasoning in effective search-and-rescue operations.

📝 Abstract
Indoor fire disasters pose severe challenges to autonomous search and rescue due to dense smoke, high temperatures, and dynamically evolving indoor environments. In such time-critical scenarios, multi-agent cooperative navigation is particularly useful, as it enables faster and broader exploration than single-agent approaches. However, existing multi-agent navigation systems are primarily vision-based and designed for benign indoor settings, leading to significant performance degradation under fire-driven dynamic conditions. In this paper, we present VULCAN, a multi-agent cooperative navigation framework based on multi-modal perception and vision-language models (VLMs), tailored for indoor fire disaster response. We extend the Habitat-Matterport3D benchmark by simulating physically realistic fire scenarios, including smoke diffusion, thermal hazards, and sensor degradation. We evaluate representative multi-agent cooperative navigation baselines under both normal and fire-driven environments. Our results reveal critical failure modes of existing methods in fire scenarios and underscore the necessity of robust perception and hazard-aware planning for reliable multi-agent search and rescue.
Problem

Research questions and friction points this paper is trying to address.

multi-agent navigation
indoor fire disaster
vision-language models
autonomous search and rescue
hazard-aware planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

vision-language models
multi-agent navigation
fire-disaster response
multi-modal perception
hazard-aware planning