Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360-Degree Firefighting Videos

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
AI systems exhibit severe perception and reasoning failures under extreme degradation (smoke, low illumination, thermal distortion), which matters most in safety-critical domains such as firefighting. Method: This paper introduces Fire360, the first 360° degraded-video benchmark designed for fire rescue. It covers five tasks: visual question answering, temporal action captioning, object localization, safety-critical reasoning, and *Transformed Object Retrieval* (TOR), a novel task that evaluates cross-modal recognition and memory of fire-damaged objects, emphasizing perception–memory–reasoning synergy. Fire360 is built from professionally captured fire-training videos, augmented with fine-grained multimodal annotations and degradation metadata. Results: Human experts achieve 83.5% accuracy on TOR, whereas state-of-the-art vision-language models (e.g., GPT-4o) underperform significantly, exposing fundamental robustness limitations under extreme degradation. Fire360 establishes the first reproducible, task-driven evaluation standard for assessing trustworthy embodied intelligence in real-world safety-critical scenarios.

📝 Abstract
Modern AI systems struggle most in environments where reliability is critical - scenes with smoke, poor visibility, and structural deformation. Each year, tens of thousands of firefighters are injured on duty, often due to breakdowns in situational perception. We introduce Fire360, a benchmark for evaluating perception and reasoning in safety-critical firefighting scenarios. The dataset includes 228 360-degree videos from professional training sessions under diverse conditions (e.g., low light, thermal distortion), annotated with action segments, object locations, and degradation metadata. Fire360 supports five tasks: Visual Question Answering, Temporal Action Captioning, Object Localization, Safety-Critical Reasoning, and Transformed Object Retrieval (TOR). TOR tests whether models can match pristine exemplars to fire-damaged counterparts in unpaired scenes, evaluating transformation-invariant recognition. While human experts achieve 83.5% on TOR, models like GPT-4o lag significantly, exposing failures in reasoning under degradation. By releasing Fire360 and its evaluation suite, we aim to advance models that not only see, but also remember, reason, and act under uncertainty. The dataset is available at: https://uofi.box.com/v/fire360dataset.
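The TOR task described above can be scored as an embedding-retrieval problem: embed each pristine exemplar and each candidate object crop from the degraded scene, then check whether the exemplar's nearest candidate (by cosine similarity) is its fire-damaged counterpart. The following is a minimal sketch of that scoring idea only; the function name, synthetic embeddings, and top-1 metric are illustrative assumptions, not the paper's actual evaluation protocol.

```python
import numpy as np

def retrieval_accuracy(exemplar_embs, candidate_embs, gt_indices):
    """Top-1 retrieval accuracy: for each pristine exemplar embedding,
    pick the most cosine-similar candidate crop and compare to ground truth."""
    a = exemplar_embs / np.linalg.norm(exemplar_embs, axis=1, keepdims=True)
    b = candidate_embs / np.linalg.norm(candidate_embs, axis=1, keepdims=True)
    sims = a @ b.T                      # (num_exemplars, num_candidates)
    preds = sims.argmax(axis=1)         # nearest candidate per exemplar
    return float((preds == np.array(gt_indices)).mean())

# Toy stand-in for real model embeddings: 4 candidate crops from a
# degraded scene, 3 pristine exemplars whose "damaged" counterparts are
# the candidates at gt, perturbed with mild noise.
rng = np.random.default_rng(0)
cands = rng.normal(size=(4, 16))
gt = [2, 0, 3]
exemplars = cands[gt] + 0.05 * rng.normal(size=(3, 16))

acc = retrieval_accuracy(exemplars, cands, gt)
print(acc)
```

With such mild perturbation the toy retrieval succeeds; the benchmark's point is that real fire damage transforms objects far more drastically, which is where current models' embeddings stop being transformation-invariant.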
Problem

Research questions and friction points this paper is trying to address.

Evaluating AI perception in firefighting scenarios with poor visibility
Assessing reasoning in safety-critical conditions like smoke and deformation
Testing transformation-invariant recognition in degraded 360-degree videos
Innovation

Methods, ideas, or system contributions that make the work stand out.

360-degree video dataset for firefighting scenarios
Five evaluation tasks: Visual Question Answering, Temporal Action Captioning, Object Localization, Safety-Critical Reasoning, and Transformed Object Retrieval
Transformed Object Retrieval (TOR) as a test of transformation-invariant recognition