Where Did It Go Wrong? Capability-Oriented Failure Attribution for Vision-and-Language Navigation Agents

📅 2026-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of fault localization in vision-and-language navigation agents, whose integration of perception, memory, planning, and decision-making complicates the identification of failure sources in safety-critical scenarios. The paper proposes a capability-oriented testing framework that, for the first time, attributes failures to specific functional capabilities. It combines adaptive test generation (based on seed selection and mutation) with capability-specific oracles and a feedback-driven iterative mechanism, enabling efficient detection and precise attribution of agent failures. Compared with state-of-the-art baselines, it uncovers more failure cases and provides interpretable, actionable diagnoses of capability deficiencies, offering concrete guidance for model improvement.
📝 Abstract
Embodied agents in safety-critical applications such as Vision-Language Navigation (VLN) rely on multiple interdependent capabilities (e.g., perception, memory, planning, decision), making failures difficult to localize and attribute. Existing testing methods are largely system-level and provide limited insight into which capability deficiencies cause task failures. We propose a capability-oriented testing approach that enables failure detection and attribution by combining (1) adaptive test case generation via seed selection and mutation, (2) capability oracles for identifying capability-specific errors, and (3) a feedback mechanism that attributes failures to capabilities and guides further test generation. Experiments show that our method discovers more failure cases and more accurately pinpoints capability-level deficiencies than state-of-the-art baselines, providing more interpretable and actionable guidance for improving embodied agents.
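The three components listed in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the toy agent, the difficulty limits, the mutation operator, and the weighting rule are hypothetical stand-ins, not the paper's implementation. It only shows the general shape of a loop that selects a seed, mutates it toward one capability, checks per-capability oracles, and feeds the attributed failures back into test generation.

```python
import random
from collections import Counter

# All names below are hypothetical; none come from the paper.
CAPABILITIES = ("perception", "memory", "planning", "decision")

# Toy difficulty limits: the stand-in agent fails a capability once
# the matching difficulty value exceeds its limit.
LIMITS = {"perception": 2, "memory": 3, "planning": 1, "decision": 3}


def mutate(case, capability):
    """Return a copy of the test case with one capability's difficulty raised by 1."""
    mutated = dict(case)
    mutated[capability] += 1
    return mutated


def toy_agent(case):
    """Stand-in for a VLN agent: report per-capability success for a test case."""
    return {cap: case[cap] <= LIMITS[cap] for cap in CAPABILITIES}


def capability_oracles(trace):
    """Return the capabilities whose oracle flags an error in the trace."""
    return [cap for cap in CAPABILITIES if not trace[cap]]


def run_campaign(seeds, budget, rng):
    """Feedback-driven loop: select a seed, mutate it, check oracles, reweight."""
    weights = {cap: 1.0 for cap in CAPABILITIES}
    attribution = Counter()
    pool = [dict(s) for s in seeds]
    for _ in range(budget):
        seed = rng.choice(pool)  # seed selection
        target = rng.choices(
            CAPABILITIES, weights=[weights[c] for c in CAPABILITIES]
        )[0]
        case = mutate(seed, target)  # mutation aimed at one capability
        failed = capability_oracles(toy_agent(case))
        if failed:
            for cap in failed:
                attribution[cap] += 1  # attribute the failure
                weights[cap] += 0.5    # feedback: test weak capabilities more
        else:
            pool.append(case)  # passing mutants become harder seeds
    return attribution


if __name__ == "__main__":
    start = {cap: 0 for cap in CAPABILITIES}
    print(run_campaign([start], budget=200, rng=random.Random(0)))
```

The returned counter maps each capability to the number of failures attributed to it. In a real setting the toy agent would be replaced by an actual navigation run, and each oracle would inspect the agent's trajectory or intermediate outputs rather than a single threshold.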
Problem

Research questions and friction points this paper is trying to address.

Vision-Language Navigation
Failure Attribution
Embodied Agents
Capability Deficiency
System Testing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Capability-Oriented Testing
Failure Attribution
Vision-Language Navigation
Adaptive Test Generation
Embodied Agents
🔎 Similar Papers
2024-03-15
IEEE/RSJ International Conference on Intelligent Robots and Systems
Citations: 4
Jianming Chen
Institute of Software, Chinese Academy of Sciences, Beijing, China; Science & Technology on Integrated Information System Laboratory, Beijing, China; State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
Yawen Wang
The University of Texas at Arlington
Gear Dynamics, Noise and Vibration
Junjie Wang
Institute of Software, Chinese Academy of Sciences
Software Engineering
Xiaofei Xie
Singapore Management University
Software Engineering, Loop Analysis, Testing, Deep Learning
Shoubin Li
Institute of Software, Chinese Academy of Sciences
Knowledge Graph
Qing Wang
Institute of Software, Chinese Academy of Sciences
Software Engineering
Fanjiang Xu
Institute of Software, Chinese Academy of Sciences, Beijing, China; Science & Technology on Integrated Information System Laboratory, Beijing, China; State Key Laboratory of Complex System Modeling and Simulation Technology, Beijing, China; University of Chinese Academy of Sciences, Beijing, China