Dreaming the Unseen: World Model-regularized Diffusion Policy for Out-of-Distribution Robustness

📅 2026-03-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion policies perform strongly in visuomotor control but are prone to failure under severe out-of-distribution (OOD) perturbations, such as object displacement or visual corruption. This work proposes Dream Diffusion Policy (DDP), which jointly optimizes a diffusion world model and a diffusion policy through a shared 3D visual encoder, giving the policy robust state-prediction capabilities. During inference, DDP detects discrepancies between real observations and its imagined predictions; under OOD conditions it disregards the corrupted visual input and instead acts on autoregressively forecasted latent dynamics until it can realign with physical reality. In experiments, DDP achieves a 73.8% success rate under OOD settings in MetaWorld, against 23.9% for the variant without predictive imagination, and 83.3% in real-world scenarios with severe spatial shifts, against 3.3% for the same variant. As a stress test, DDP still reaches a 76.7% real-world success rate when relying entirely on open-loop imagination after initialization.

📝 Abstract
Diffusion policies excel at visuomotor control but often fail catastrophically under severe out-of-distribution (OOD) disturbances, such as unexpected object displacements or visual corruptions. To address this vulnerability, we introduce the Dream Diffusion Policy (DDP), a framework that deeply integrates a diffusion world model into the policy's training objective via a shared 3D visual encoder. This co-optimization endows the policy with robust state-prediction capabilities. When encountering sudden OOD anomalies during inference, DDP detects the real-imagination discrepancy and actively abandons the corrupted visual stream. Instead, it relies on its internal "imagination" (autoregressively forecasted latent dynamics) to safely bypass the disruption, generating imagined trajectories before smoothly realigning with physical reality. Extensive evaluations demonstrate DDP's exceptional resilience. Notably, DDP achieves a 73.8% OOD success rate on MetaWorld (vs. 23.9% without predictive imagination) and an 83.3% success rate under severe real-world spatial shifts (vs. 3.3% without predictive imagination). Furthermore, as a stress test, DDP maintains a 76.7% real-world success rate even when relying entirely on open-loop imagination post-initialization.
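
The switching behavior described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the L2 distance as the discrepancy measure, the fixed `threshold`, the toy `dynamics` function, and the names `select_latent` and `rollout` do not come from the paper, which does not specify these details here.

```python
import numpy as np


def select_latent(observed, imagined, threshold):
    """Pick the latent state to act on (hypothetical helper).

    Assumption: the real-imagination discrepancy is the L2 distance
    between the encoded observation and the world model's forecast.
    """
    gap = float(np.linalg.norm(observed - imagined))
    if gap > threshold:
        # Observation looks corrupted or shifted: trust the forecast.
        return imagined, True
    # Observation agrees with the forecast: realign with reality.
    return observed, False


def rollout(observations, dynamics, threshold):
    """Step through encoded observations, falling back to imagination.

    `dynamics` stands in for the learned latent world model; here it is
    any function mapping the current latent to the next one.
    """
    latent = observations[0]  # initialization uses the real observation
    flags = [False]
    for obs in observations[1:]:
        imagined = dynamics(latent)  # autoregressive one-step forecast
        latent, used_imagination = select_latent(obs, imagined, threshold)
        flags.append(used_imagination)
    return latent, flags


# Toy example: the third observation is a gross outlier.
observations = [
    np.zeros(2),
    np.ones(2),
    np.full(2, 50.0),  # simulated OOD corruption
    np.full(2, 3.0),
]
final_latent, flags = rollout(observations, lambda z: z + 1.0, threshold=1.0)
```

In this toy run only the corrupted step is replaced by the forecast, and the loop returns to real observations as soon as they agree with the forecast again. A real policy would condition its action denoising on the selected latent rather than returning it.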
Problem

Research questions and friction points this paper is trying to address.

out-of-distribution robustness
diffusion policy
visuomotor control
visual corruption
object displacement
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion policy
world model
out-of-distribution robustness
predictive imagination
3D visual encoder
Authors

Ziou Hu
School of Computation, Information and Technology, Technical University of Munich, Garching, Germany

Xiangtong Yao
Ph.D. Student, Technische Universität München
Robot Learning, Robotics

Yuan Meng
School of Computation, Information and Technology, Technical University of Munich, Garching, Germany

Zhenshan Bing
Nanjing University / Technical University of Munich
Robotics

Alois Knoll
Technische Universität München
Robotics, AI, Sensor Data Fusion, Autonomous Driving, Cyber Physical Systems