DADP: Domain Adaptive Diffusion Policy

📅 2026-02-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization of existing learning-based control policies in unseen dynamic environments, which stems from entangled domain representations that conflate static environmental context with dynamic characteristics, thereby hindering zero-shot adaptation. To resolve this, the authors propose an unsupervised disentanglement approach that separates static domain information from dynamic properties through lagged contextual dynamics prediction. The resulting disentangled domain representations are explicitly injected into both the prior distribution and target reconstruction process of a diffusion model, enabling domain-aware policy generation. By seamlessly integrating diffusion models with a reinforcement learning framework, the method achieves state-of-the-art performance across multiple challenging locomotion and manipulation tasks, demonstrating superior cross-domain generalization and robust zero-shot adaptation capabilities.

📝 Abstract
Learning domain adaptive policies that can generalize to unseen transition dynamics remains a fundamental challenge in learning-based control. Substantial progress has been made through domain representation learning to capture domain-specific information, thus enabling domain-aware decision making. We analyze the process of learning domain representations through dynamical prediction and find that selecting contexts adjacent to the current step causes the learned representations to entangle static domain information with varying dynamical properties. Such a mixture can confuse the conditioned policy, thereby constraining zero-shot adaptation. To tackle this challenge, we propose DADP (Domain Adaptive Diffusion Policy), which achieves robust adaptation through unsupervised disentanglement and domain-aware diffusion injection. First, we introduce Lagged Context Dynamical Prediction, a strategy that conditions future state estimation on a historically offset context; by increasing this temporal gap, we disentangle static domain representations in an unsupervised manner, filtering out transient properties. Second, we integrate the learned domain representations directly into the generative process by biasing the prior distribution and reformulating the diffusion target. Extensive experiments on challenging benchmarks across locomotion and manipulation demonstrate the superior performance and generalizability of DADP over prior methods. More visualization results are available at https://outsider86.github.io/DomainAdaptiveDiffusionPolicy/.
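The core Lagged Context Dynamical Prediction idea from the abstract can be sketched as follows. This is a toy illustration, not the authors' implementation: the mean-pooled linear encoder, the linear dynamics head, and all names (`encode_lagged_context`, `W_z`, `W_dyn`) are assumptions standing in for the paper's learned networks. The key point is that the context window ends `lag` steps before the prediction step, so the embedding cannot rely on transient dynamics adjacent to it.

```python
import numpy as np

def encode_lagged_context(obs, act, t, lag, ctx_len, W_z):
    """Encode the (state, action) window that ends `lag` steps before t.

    The temporal gap between this window and the prediction step is what
    is meant to filter out transient properties, leaving a static domain
    embedding z. Mean pooling plus a linear map stand in here for a real
    sequence encoder.
    """
    end = t - lag
    ctx = np.concatenate([obs[end - ctx_len:end],
                          act[end - ctx_len:end]], axis=-1)
    return ctx.mean(axis=0) @ W_z  # domain embedding z, shape (z_dim,)

def lagged_prediction_loss(obs, act, t, lag, ctx_len, W_z, W_dyn):
    """Predict s_{t+1} from (s_t, a_t, z) and score it with MSE.

    Training the encoder through this prediction loss is the
    unsupervised signal; no domain labels are needed.
    """
    z = encode_lagged_context(obs, act, t, lag, ctx_len, W_z)
    x = np.concatenate([obs[t], act[t], z])
    return float(np.mean((x @ W_dyn - obs[t + 1]) ** 2))
```

With `lag = 0` this reduces to the adjacent-context scheme the paper argues against; increasing `lag` widens the gap and, per the abstract, pushes z toward static domain information that the diffusion policy can then condition on.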
Problem

Research questions and friction points this paper is trying to address.

domain adaptation
transition dynamics
zero-shot adaptation
domain representation
policy generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Domain Adaptation
Diffusion Policy
Disentangled Representation
Zero-shot Generalization
Lagged Context Dynamical Prediction
Pengcheng Wang
UC Berkeley
Robotics, Control, Reinforcement Learning
Qinghang Liu
University of California, Berkeley, California, USA; Peking University, Beijing, China
Haotian Lin
Carnegie Mellon University
Robotics, Autonomous Driving, Reinforcement Learning
Yiheng Li
University of California, Berkeley, California, USA
Guojian Zhan
University of California, Berkeley, California, USA; Tsinghua University, Beijing, China
Masayoshi Tomizuka
Mechanical Engineering, University of California
mechanical engineering, dynamic systems, control, mechatronics
Yixiao Wang
University of California, Berkeley
robotics, diffusion models, trajectory prediction