World Models for Anomaly Detection during Model-Based Reinforcement Learning Inference

📅 2025-03-04
🤖 AI Summary
Model-based reinforcement learning (MBRL) controllers often fail to operate safely when deployed in unknown or degraded environments. Method: This paper introduces a world-model-based online anomaly detection and safety response mechanism. It repurposes pre-trained deterministic or probabilistic world models (e.g., RSSM, DreamerV3) as real-time safety monitors by leveraging multi-step prediction-observation residual statistics, adaptive thresholding, and closed-loop feedback—enabling unsupervised, task-agnostic anomaly detection during inference. Contribution/Results: To our knowledge, this is the first work to elevate world models from training tools to online safety guardians without requiring prior task-specific knowledge. Evaluated in simulation, the method successfully detects abrupt local geometric and gravitational changes; on a physical quadcopter platform, it accurately identifies external disturbance forces, significantly enhancing system robustness and operational safety.
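The core mechanism described above, comparing multi-step world-model predictions against incoming observations and flagging anomalies via adaptive thresholding, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the residual statistic (per-step L2 norm) and the running mean-plus-k-sigma threshold are assumptions, and all names (`residual_norms`, `AdaptiveThresholdMonitor`) are hypothetical.

```python
import numpy as np

def residual_norms(predicted, observed):
    """Per-step L2 norms of the prediction-observation residuals.

    predicted, observed: arrays of shape (T, state_dim), e.g. a multi-step
    world-model rollout and the actually observed trajectory.
    """
    return np.linalg.norm(predicted - observed, axis=1)

class AdaptiveThresholdMonitor:
    """Flags a residual as anomalous when it exceeds a running
    mean + k * std estimate of past residuals (one simple form of
    adaptive thresholding; the paper's exact statistic may differ)."""

    def __init__(self, k=3.0):
        self.k = k
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)

    def update(self, r):
        """Ingest one residual norm; return True if it is anomalous."""
        if self.n >= 2:
            std = (self.m2 / (self.n - 1)) ** 0.5
            anomalous = r > self.mean + self.k * std
        else:
            anomalous = False  # too few samples to judge yet
        # fold the new sample into the running statistics
        self.n += 1
        delta = r - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (r - self.mean)
        return anomalous
```

In use, the monitor ingests one residual per control step; because the threshold adapts to the residual scale seen so far, no task-specific calibration is needed, matching the unsupervised, task-agnostic framing above.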

📝 Abstract
Learning-based controllers are often purposefully kept out of real-world applications due to concerns about their safety and reliability. We explore how state-of-the-art world models in Model-Based Reinforcement Learning can be utilized beyond the training phase to ensure a deployed policy only operates within regions of the state-space it is sufficiently familiar with. This is achieved by continuously monitoring discrepancies between a world model's predictions and observed system behavior during inference. It allows for triggering appropriate measures, such as an emergency stop, once an error threshold is surpassed. This does not require any task-specific knowledge and is thus universally applicable. Simulated experiments on established robot control tasks show the effectiveness of this method, recognizing changes in local robot geometry and global gravitational magnitude. Real-world experiments using an agile quadcopter further demonstrate the benefits of this approach by detecting unexpected forces acting on the vehicle. These results indicate how even in new and adverse conditions, safe and reliable operation of otherwise unpredictable learning-based controllers can be achieved.
Problem

Research questions and friction points this paper addresses.

Ensuring safety of learning-based controllers in real-world applications
Detecting anomalies during reinforcement learning inference using world models
Monitoring discrepancies between predicted and observed system behavior
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes world models for anomaly detection
Monitors discrepancies during reinforcement learning inference
Triggers safety measures upon surpassing error thresholds
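The closed-loop safety response listed above can be sketched as a thin wrapper around the learned policy: once the monitor flags an anomaly, the wrapper latches into a fallback action (standing in for an emergency stop). This is a hedged sketch under assumed interfaces; the class name `SafetyGuard`, the latching behavior, and the fixed `safe_action` fallback are illustrative, not the paper's design.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SafetyGuard:
    """Wraps a learned controller with an anomaly monitor.

    policy:       the learned controller, mapping observation -> action.
    is_anomalous: monitor verdict for the latest observation/residual.
    safe_action:  fallback command, e.g. motors-off or hover (hypothetical).
    """
    policy: Callable[[List[float]], List[float]]
    is_anomalous: Callable[[List[float]], bool]
    safe_action: List[float]
    stopped: bool = False

    def act(self, observation):
        if self.stopped or self.is_anomalous(observation):
            self.stopped = True  # latch: remain in safe mode once triggered
            return self.safe_action
        return self.policy(observation)
```

Latching is a deliberate choice here: after a detection, the system stays in the safe mode rather than resuming control if the residual happens to drop back below threshold.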
Fabian Domberg
Autonomous Systems Lab (ASL), Institute for Electrical Engineering in Medicine, University of Lübeck, 23562 Lübeck, Germany
Georg Schildbach
Professor of Mechatronics, University of Luebeck
dynamic systems, controls, safety and reliability, automotive systems, drones