🤖 AI Summary
It remains challenging to detect, isolate, and contain context-dependent failures in robotic perception systems, particularly failures that propagate across interconnected modules. To address this, we propose the first counterfactual-reasoning-based fault diagnosis framework tailored to perception systems. Methodologically, we construct a causal-graph-driven analytical redundancy model and introduce a novel Effective Information (EI) metric that quantifies how informative a control input is for fault detection and isolation. We further design a causal multi-armed bandit model that integrates Monte Carlo tree search with upper confidence bounds to enable active fault isolation, so the framework combines passive monitoring with active intervention. Evaluated in a space-robot visual navigation scenario, it accurately identifies three representative failure modes (sensor damage, dynamic interference, and perceptual degradation), with a 27.3% improvement in diagnostic accuracy and a 41.5% reduction in mean fault localization latency. The approach significantly improves system robustness and interpretability while enabling principled, causally grounded fault analysis.
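The summary describes EI as a measure of how informative a control input is for fault detection and isolation. One minimal, hypothetical reading of such a metric is the expected information gain (mutual information) between the fault hypothesis and a reliability-test outcome under a fixed control input. The function name and array shapes below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def effective_information(belief, outcome_model):
    """Mutual information between the fault hypothesis H and the
    reliability-test outcome O under one fixed control input u.

    belief:        P(H), shape (n_faults,)
    outcome_model: P(O | H, u), shape (n_faults, n_outcomes)
    """
    p_o = belief @ outcome_model             # marginal P(O)
    joint = belief[:, None] * outcome_model  # joint P(H, O)
    with np.errstate(divide="ignore", invalid="ignore"):
        # log-ratio is taken only where the joint has support
        ratio = np.where(joint > 0,
                         joint / (belief[:, None] * p_o[None, :]),
                         1.0)
    return float(np.sum(joint * np.log(ratio)))
```

Under this reading, a test whose outcome deterministically reveals the fault yields EI = log(n_faults) for a uniform belief, while a test whose outcome is independent of the fault yields EI = 0, so maximizing EI favors control inputs whose tests actually discriminate between fault hypotheses.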
📝 Abstract
Perception systems provide a rich understanding of the environment for autonomous systems, shaping decisions in all downstream modules, so accurate detection and isolation of faults in perception systems is important. Such faults pose particular challenges: they are often tied to the perceptual context of the environment, and errors in multi-stage perception pipelines can propagate across modules. To address this, we adopt a counterfactual reasoning approach and propose a framework for fault detection and isolation (FDI) in perception systems. Rather than relying on physical redundancy (i.e., extra sensors), our approach uses analytical redundancy with counterfactual reasoning, constructing perception reliability tests as causal outcomes influenced by system states and fault scenarios. Counterfactual reasoning generates reliability test results under hypothesized faults, which are used to update the belief over fault hypotheses. We derive both passive and active FDI methods. While passive FDI is achieved through belief updates alone, active FDI is formulated as a causal bandit problem in which Monte Carlo Tree Search (MCTS) with the upper confidence bound (UCB) finds control inputs that maximize a detection and isolation metric, designated Effective Information (EI), which quantifies the informativeness of control inputs for FDI. We demonstrate the approach in a robot exploration scenario where a space robot performing vision-based navigation actively adjusts its attitude to increase EI and correctly isolate faults caused by sensor damage, dynamic scenes, and perceptual degradation.
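The passive/active split described in the abstract can be sketched with two small, hypothetical ingredients: a Bayes update of the belief over fault hypotheses given a reliability-test likelihood (passive FDI), and a UCB score for choosing the next control input from accumulated EI rewards, used here as a flat-bandit stand-in for the paper's MCTS search (active FDI). All names and signatures are illustrative assumptions:

```python
import numpy as np

def update_belief(belief, likelihoods):
    """Passive FDI: Bayes update of P(fault) after one reliability test.

    likelihoods[i] = P(observed test result | fault hypothesis i)
    """
    posterior = belief * likelihoods
    return posterior / posterior.sum()

def ucb_select(ei_sums, counts, c=1.4):
    """Active FDI: pick the control input with the highest UCB score,
    where the per-input reward is the EI observed after applying it.

    ei_sums[a]: total EI reward collected for control input a
    counts[a]:  number of times input a has been tried
    """
    total = counts.sum()
    mean = ei_sums / np.maximum(counts, 1)
    bonus = c * np.sqrt(np.log(max(total, 1)) / np.maximum(counts, 1))
    scores = mean + bonus
    # untried inputs get an infinite score so each is explored once
    scores[counts == 0] = np.inf
    return int(np.argmax(scores))
```

In this sketch the robot alternates the two calls: apply the input chosen by `ucb_select`, run the reliability test, credit the resulting EI to that input, and refine the fault belief with `update_belief` until one hypothesis dominates.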