🤖 AI Summary
Current Vision-Language-Action (VLA) models exhibit limited failure-recovery capability in open-world robotic manipulation, largely because their training data consist almost exclusively of successful demonstrations and lack explicit modeling of failure modes and corresponding recovery strategies. To address this, the authors propose RoboFAC, a framework comprising: (1) a large-scale dataset of erroneous robot manipulation trajectories paired with natural-language question-answer annotations; (2) a model trained on this dataset for task understanding, failure analysis, and failure correction; and (3) an external-supervision mechanism that integrates the model into real-world VLA control pipelines. On the accompanying evaluation benchmark, RoboFAC outperforms GPT-4o by 34.1%. When integrated into a real-world VLA system as an external supervisor providing correction instructions, it yields a 29.1% average relative improvement in success rate across four real-world manipulation tasks, enhancing robustness in open-ended environments.
📝 Abstract
Vision-Language-Action (VLA) models have recently advanced robotic manipulation by translating natural-language instructions and image information into sequential control actions. However, these models often underperform in open-world scenarios, as they are predominantly trained on successful expert demonstrations and exhibit a limited capacity for failure recovery. In this work, we present the Robotic Failure Analysis and Correction (RoboFAC) framework to address this issue. First, we construct the RoboFAC dataset, comprising 9,440 erroneous manipulation trajectories and 78,623 QA pairs across 16 diverse tasks and 53 scenes in both simulated and real-world environments. Leveraging our dataset, we develop the RoboFAC model, which is capable of task understanding, failure analysis, and failure correction. Experimental results demonstrate that the RoboFAC model outperforms GPT-4o by 34.1% on our evaluation benchmark. Furthermore, we integrate the RoboFAC model into a real-world VLA control pipeline as an external supervisor providing correction instructions, yielding a 29.1% relative improvement on average across four real-world tasks. These results show that our RoboFAC framework effectively handles robotic failures and assists the VLA model in recovering from them.
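The external-supervision integration described above can be sketched as a simple control loop: a VLA policy proposes actions, and a RoboFAC-style supervisor inspects the trajectory and, on detecting a failure, injects a natural-language correction instruction back into the policy. This is a minimal illustrative sketch; the class names (`VLAPolicy`, `FailureAnalyzer`), the `drop`-based failure trigger, and the correction text are hypothetical stand-ins, not the paper's actual API.

```python
class VLAPolicy:
    """Hypothetical stand-in for a vision-language-action policy."""

    def act(self, observation, instruction):
        # A real policy would map (image, language) to low-level control;
        # here we just record the instruction it was conditioned on.
        return {"action": "move", "instruction": instruction}


class FailureAnalyzer:
    """Hypothetical stand-in for a RoboFAC-style external supervisor."""

    def analyze(self, trajectory):
        # A real supervisor would run failure analysis on the observed
        # trajectory; this toy version flags a dropped object.
        if trajectory and trajectory[-1]["action"] == "drop":
            return "Re-grasp the object closer to its center of mass."
        return None  # no failure detected


def run_episode(policy, supervisor, instruction, steps=5):
    """Closed loop: the supervisor's correction, if any, replaces the
    instruction that conditions the policy on subsequent steps."""
    trajectory = []
    for _ in range(steps):
        trajectory.append(policy.act(observation=None, instruction=instruction))
        correction = supervisor.analyze(trajectory)
        if correction is not None:
            instruction = correction  # feed the correction back in
    return trajectory, instruction
```

The key design point, per the abstract, is that supervision is external: the VLA policy itself is unchanged, and recovery is driven purely by the corrective language instruction.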