DEEDEE: Fast and Scalable Out-of-Distribution Dynamics Detection

📅 2025-10-24
🤖 AI Summary
Reinforcement learning (RL) agents are brittle under distribution shift, which undermines safety-critical deployment and makes reliable out-of-distribution (OOD) detection essential. Method: The paper proposes DEEDEE, a lightweight two-statistic OOD detection framework that combines an episodewise mean with an RBF-kernel similarity to a training-set summary, capturing complementary global and local deviations. The results indicate that a small set of low-order statistics often suffices to characterize diverse anomaly patterns, reducing reliance on high-dimensional representations or complex models. Contribution/Results: The method requires only a single forward pass and elementary statistical computations, cutting compute (FLOPs / wall-time) 600-fold compared with contemporary detectors. On standard RL OOD suites it matches or surpasses those detectors, with an average 5% absolute accuracy gain over strong baselines; it also enables real-time inference and scales efficiently to large systems. The approach offers an efficient, interpretable, plug-and-play OOD detection paradigm for safety-critical RL.

📝 Abstract
Deploying reinforcement learning (RL) in safety-critical settings is constrained by brittleness under distribution shift. We study out-of-distribution (OOD) detection for RL time series and introduce DEEDEE, a two-statistic detector that revisits representation-heavy pipelines with a minimal alternative. DEEDEE uses only an episodewise mean and an RBF kernel similarity to a training summary, capturing complementary global and local deviations. Despite its simplicity, DEEDEE matches or surpasses contemporary detectors across standard RL OOD suites, delivering a 600-fold reduction in compute (FLOPs / wall-time) and an average 5% absolute accuracy gain over strong baselines. Conceptually, our results indicate that diverse anomaly types often imprint on RL trajectories through a small set of low-order statistics, suggesting a compact foundation for OOD detection in complex environments.
Problem

Research questions and friction points this paper is trying to address.

Detects out-of-distribution dynamics in reinforcement learning
Addresses brittleness under distribution shift in safety-critical RL
Identifies anomalies using minimal low-order trajectory statistics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-statistic detector using mean and kernel similarity
Captures global and local deviations in RL trajectories
Reduces compute (FLOPs / wall-time) 600-fold
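
The two-statistic idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the page does not define the statistics precisely, so everything here is an assumption made for illustration, including the choice of per-feature mean and standard deviation as the "training summary", the bandwidth `gamma`, the additive combination in `ood_score`, and all function names.

```python
import numpy as np


def fit_summary(train_episodes):
    """Summarize in-distribution data by per-feature mean and std (assumed summary)."""
    stacked = np.concatenate(train_episodes, axis=0)  # shape: (sum of T_i, d)
    mu = stacked.mean(axis=0)
    sigma = stacked.std(axis=0) + 1e-8  # avoid division by zero
    return mu, sigma


def two_statistics(episode, mu, sigma, gamma=0.5):
    """Return (mean_shift, rbf_similarity) for one episode of shape (T, d)."""
    z = (episode - mu) / sigma  # standardize every time step
    d = z.shape[1]
    # Statistic 1: size of the shift in the episodewise mean.
    mean_shift = float(np.linalg.norm(z.mean(axis=0)) / np.sqrt(d))
    # Statistic 2: average RBF kernel similarity of each step to the summary.
    sq_dist = (z ** 2).sum(axis=1) / d
    rbf_similarity = float(np.exp(-gamma * sq_dist).mean())
    return mean_shift, rbf_similarity


def ood_score(episode, mu, sigma, gamma=0.5):
    """Higher means more anomalous; the additive combination is illustrative only."""
    mean_shift, rbf_similarity = two_statistics(episode, mu, sigma, gamma)
    return mean_shift + (1.0 - rbf_similarity)


# Synthetic stand-in for RL trajectories (hypothetical data, not a benchmark).
rng = np.random.default_rng(0)
train = [rng.normal(0.0, 1.0, size=(200, 4)) for _ in range(20)]
mu, sigma = fit_summary(train)

in_dist = rng.normal(0.0, 1.0, size=(200, 4))
shifted = rng.normal(3.0, 1.0, size=(200, 4))  # simulated dynamics shift

score_in = ood_score(in_dist, mu, sigma)
score_shifted = ood_score(shifted, mu, sigma)
```

In practice a detection threshold would be calibrated on held-out in-distribution episodes, for example as a high quantile of their scores. Both statistics cost a handful of vector operations per episode, which is consistent with the low compute budget the summary describes.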
Tala Aljaafari
University of Oxford, Oxford, United Kingdom
Varun Kanade
University of Oxford
Machine Learning, Theory, Evolution
Philip Torr
Professor, University of Oxford
Department of Engineering
Christian Schröder de Witt
University of Oxford, Oxford, United Kingdom