🤖 AI Summary
In open-world environments, learned robot policies typically have no built-in way to detect and respond to execution failures, which can lead to unreliable or unsafe behavior. To address this, we propose an execution framework that couples a learned policy with vision-based anomaly detection and a multi-level recovery mechanism. First, an anomaly detection model is trained on data collected during nominal executions of the policy and runs online during deployment to flag anomalies such as trajectory deviations and human-induced disturbances. Second, a detected anomaly triggers a three-level sequential recovery process: pausing execution temporarily, applying a local perturbation to the robot's state, and resetting the robot to a safe state sampled from a learned execution success model. The framework is evaluated on a door handle reaching task (Kinova Gen3, with a policy trained in simulation and transferred to the real robot) and an object placing task (UFactory xArm 6, with a general-purpose policy model). In both scenarios, adding anomaly detection and recovery increases the execution success rate when anomalies are present.
📝 Abstract
Learned robot policies have consistently been shown to be versatile, but they typically have no built-in mechanism for handling the complexity of open environments, which makes them prone to execution failures. Deploying policies that cannot recognise and react to failures may therefore lead to unreliable and unsafe robot behaviour. In this paper, we present a framework that couples a learned policy with a method to detect visual anomalies during policy deployment and to perform recovery behaviours when necessary, with the aim of preventing failures. Specifically, we train an anomaly detection model using data collected during nominal executions of a trained policy. This model is then integrated into the online policy execution process, so that deviations from the nominal execution trigger a three-level sequential recovery process that consists of (i) pausing the execution temporarily, (ii) performing a local perturbation of the robot's state, and (iii) resetting the robot to a safe state by sampling from a learned execution success model. We evaluate the proposed method in two scenarios: (i) a door handle reaching task with a Kinova Gen3 arm, using a policy trained in simulation and transferred to the real robot, and (ii) an object placing task with a UFactory xArm 6, using a general-purpose policy model. Our results show that integrating policy execution with anomaly detection and recovery increases the execution success rate in environments with various anomalies, such as trajectory deviations and adversarial human interventions.
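
The escalation logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the anomaly threshold is calibrated, how the system decides that an anomaly has cleared, or what form the execution success model takes. The mean-plus-k-standard-deviations threshold, the callback interface, and the names `fit_threshold`, `sample_safe_state`, and `recover` are all assumptions made for illustration.

```python
import random
from enum import Enum
from typing import Callable, Optional, Sequence


class RecoveryLevel(Enum):
    PAUSE = 1    # (i) pause the execution temporarily
    PERTURB = 2  # (ii) locally perturb the robot's state
    RESET = 3    # (iii) reset to a sampled safe state


def fit_threshold(nominal_scores: Sequence[float], k: float = 3.0) -> float:
    """Assumed calibration rule: mean + k * std of anomaly scores
    recorded during nominal executions (not taken from the paper)."""
    n = len(nominal_scores)
    mean = sum(nominal_scores) / n
    var = sum((s - mean) ** 2 for s in nominal_scores) / n
    return mean + k * var ** 0.5


def sample_safe_state(candidates: Sequence, success_prob: Callable[[object], float],
                      rng: random.Random):
    """Sample a reset state with probability proportional to its predicted
    success. `success_prob` stands in for the learned success model."""
    weights = [success_prob(c) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def recover(is_anomalous: Callable[[], bool],
            pause: Callable[[], None],
            perturb: Callable[[], None],
            reset: Callable[[], None]) -> Optional[RecoveryLevel]:
    """Run the three levels in order and stop at the first one after which
    the anomaly is no longer detected. Returns that level, or None if the
    anomaly persists after all three."""
    steps = (
        (RecoveryLevel.PAUSE, pause),
        (RecoveryLevel.PERTURB, perturb),
        (RecoveryLevel.RESET, reset),
    )
    for level, action in steps:
        action()
        if not is_anomalous():
            return level
    return None
```

In a deployment loop, `is_anomalous` would compare the detector's score on the current camera frame against the calibrated threshold, and `reset` would move the robot to a state chosen by `sample_safe_state`. Passing the behaviours in as callbacks keeps the escalation order independent of any particular robot or policy.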