Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning

📅 2025-05-15

🤖 AI Summary

This work addresses real-time failure prevention for robots operating in hazardous out-of-distribution (OOD) scenarios. It presents FORTRESS, a framework that couples low-frequency multi-modal reasoning with dynamics-aware motion planning. Instead of relying on hand-crafted fallback policies, the method uses multi-modal reasoners during nominal operation to identify fallback goals and anticipate failure modes. When a runtime monitor triggers a fallback, it synthesizes plans to those goals while inferring and avoiding semantically unsafe regions in real time. The key contribution is bridging open-world multi-modal reasoning with dynamics-aware planning, which removes the need for hard-coded fallbacks and human safety interventions. FORTRESS outperforms on-the-fly prompting of slow reasoning models in safety classification accuracy on synthetic benchmarks and real-world ANYmal robot data, and it improves system safety and planning success in simulation and on quadrotor hardware for urban navigation.

📝 Abstract

Foundation models can provide robust high-level reasoning on appropriate safety interventions in hazardous scenarios beyond a robot's training data, i.e., out-of-distribution (OOD) failures. However, due to the high inference latency of Large Vision and Language Models, current methods rely on manually defined intervention policies to enact fallbacks, thereby lacking the ability to plan generalizable, semantically safe motions. To overcome these challenges, we present FORTRESS, a framework that generates and reasons about semantically safe fallback strategies in real time to prevent OOD failures. At a low frequency in nominal operations, FORTRESS uses multi-modal reasoners to identify goals and anticipate failure modes. When a runtime monitor triggers a fallback response, FORTRESS rapidly synthesizes plans to fallback goals while inferring and avoiding semantically unsafe regions in real time. By bridging open-world, multi-modal reasoning with dynamics-aware planning, we eliminate the need for hard-coded fallbacks and human safety interventions. FORTRESS outperforms on-the-fly prompting of slow reasoning models in safety classification accuracy on synthetic benchmarks and real-world ANYmal robot data, and further improves system safety and planning success in simulation and on quadrotor hardware for urban navigation.
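
The two-rate structure described in the abstract (slow reasoning during nominal operation, fast response when a monitor fires) can be illustrated with a minimal Python sketch. This is not the FORTRESS implementation: `SlowReasoningCache`, `run_loop`, `slow_period`, and the toy reasoner, monitor, and planner are all hypothetical names and stand-ins introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class SlowReasoningCache:
    """Hypothetical container for the output of the low-frequency reasoning step."""
    fallback_goals: List[Point] = field(default_factory=list)
    failure_modes: List[str] = field(default_factory=list)


def run_loop(
    observations: List[dict],
    slow_reasoner: Callable[[dict], SlowReasoningCache],
    monitor: Callable[[dict], bool],
    fallback_planner: Callable[[dict, SlowReasoningCache], Optional[List[Point]]],
    slow_period: int = 10,
) -> List[str]:
    """Run a two-rate loop and return the action label chosen at each step."""
    cache = SlowReasoningCache()
    log: List[str] = []
    for step, obs in enumerate(observations):
        # Low-frequency step: refresh fallback goals and anticipated failure modes.
        if step % slow_period == 0:
            cache = slow_reasoner(obs)
        # High-frequency step: the monitor is checked on every observation.
        if monitor(obs):
            # Planning reuses the cached reasoning instead of querying a slow model.
            plan = fallback_planner(obs, cache)
            log.append("fallback" if plan is not None else "stop")
        else:
            log.append("nominal")
    return log


# Toy stand-ins (assumptions, not real models or planners).
def toy_reasoner(obs: dict) -> SlowReasoningCache:
    return SlowReasoningCache(fallback_goals=[(0.0, 0.0)], failure_modes=["fire"])


def toy_monitor(obs: dict) -> bool:
    return bool(obs.get("anomaly", False))


def toy_planner(obs: dict, cache: SlowReasoningCache) -> Optional[List[Point]]:
    if not cache.fallback_goals:
        return None
    return [obs["pos"], cache.fallback_goals[0]]


if __name__ == "__main__":
    stream = [
        {"pos": (1.0, 1.0)},
        {"pos": (1.0, 2.0), "anomaly": True},
        {"pos": (1.0, 3.0)},
    ]
    print(run_loop(stream, toy_reasoner, toy_monitor, toy_planner, slow_period=2))
```

The point of the sketch is the design choice the abstract motivates: the expensive reasoning runs off the critical path, so the fallback response depends only on cached results and a fast planner.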

Problem

Research questions and friction points this paper is trying to address.

Prevent out-of-distribution robot failures in real time

Generate semantically safe fallback strategies dynamically

Bridge multi-modal reasoning with real-time planning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal reasoning for real-time failure prevention

Dynamic planning to avoid unsafe regions

Bridging open-world reasoning with dynamics-aware planning
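
As a rough illustration of planning to a fallback goal while avoiding unsafe regions, the sketch below runs a breadth-first search on a small occupancy grid. It is a deliberate simplification: the paper describes dynamics-aware planning, whereas this example ignores dynamics entirely, and `plan_avoiding`, the grid representation, and the `unsafe` cell set are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]


def plan_avoiding(
    start: Cell, goal: Cell, unsafe: Set[Cell], width: int, height: int
) -> Optional[List[Cell]]:
    """Return a shortest 4-connected grid path from start to goal that
    never enters an unsafe cell, or None if no such path exists."""
    if start in unsafe or goal in unsafe:
        return None
    parents: Dict[Cell, Optional[Cell]] = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk parent links back to the start, then reverse.
            path: List[Cell] = []
            node: Optional[Cell] = cell
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            in_bounds = 0 <= nx < width and 0 <= ny < height
            if in_bounds and nxt not in unsafe and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None


if __name__ == "__main__":
    # The direct route through (1, 0) is marked unsafe, so the path detours.
    print(plan_avoiding((0, 0), (2, 0), {(1, 0)}, width=3, height=3))
```

In a setup like the one the paper describes, the unsafe set would come from semantic reasoning about the scene rather than being hard-coded, and the planner would respect the robot's dynamics.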
