Learning Generalizable Visuomotor Policy through Dynamics-Alignment

📅 2025-10-30
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Behavior cloning suffers from poor generalization due to the scarcity of expert demonstrations, while existing video prediction models lack control-input awareness and struggle to support precise manipulation tasks. To address these limitations, this paper proposes the Dynamics-Aligned Flow Matching Policy (DAP), a novel framework that establishes a mutual feedback loop between a policy network and a dynamics model. DAP employs a dynamics-aware architecture to jointly optimize action generation and state evolution, and introduces self-correcting dynamics alignment during inference to improve out-of-distribution (OOD) adaptability. Evaluated on real-world robotic manipulation tasks, DAP significantly outperforms mainstream baselines, particularly under OOD conditions such as visual occlusions and lighting variations, demonstrating superior robustness and generalization.

📝 Abstract
Behavior cloning methods for robot learning suffer from poor generalization due to limited data support beyond expert demonstrations. Recent approaches leveraging video prediction models have shown promising results by learning rich spatiotemporal representations from large-scale datasets. However, these models learn action-agnostic dynamics that cannot distinguish between different control inputs, limiting their utility for precise manipulation tasks and requiring large pretraining datasets. We propose a Dynamics-Aligned Flow Matching Policy (DAP) that integrates dynamics prediction into policy learning. Our method introduces a novel architecture where policy and dynamics models provide mutual corrective feedback during action generation, enabling self-correction and improved generalization. Empirical validation demonstrates generalization performance superior to baseline methods on real-world robotic manipulation tasks, showing particular robustness in OOD scenarios including visual distractions and lighting variations.
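The abstract describes generating actions by integrating a learned flow-matching velocity field while a dynamics model supplies corrective feedback at each step. The paper's actual networks and correction rule are not given here, so the sketch below is only a minimal illustration of that inference loop under stated assumptions: `policy_velocity` and `dynamics_error` are hypothetical stand-ins for the learned policy flow and the dynamics model's consistency signal, and the correction is applied as a simple guidance term in an Euler integration of the flow ODE.

```python
import numpy as np

def policy_velocity(action, obs, t):
    # Hypothetical stand-in for the learned flow-matching velocity field:
    # drives the action sample toward an observation-dependent target.
    target = np.tanh(obs)
    return target - action

def dynamics_error(action, obs):
    # Hypothetical stand-in for the dynamics model's feedback: mismatch
    # between the next state predicted under this action and a desired one.
    predicted_next = obs + 0.1 * action
    desired_next = obs + 0.1 * np.tanh(obs)
    return predicted_next - desired_next

def generate_action(obs, steps=50, guidance=0.5):
    """Integrate the flow ODE from noise to an action, nudging each
    Euler step with dynamics feedback (the mutual-correction idea)."""
    rng = np.random.default_rng(0)
    action = rng.standard_normal(obs.shape)   # start from Gaussian noise
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = policy_velocity(action, obs, t)            # policy proposal
        corr = dynamics_error(action, obs)             # dynamics feedback
        action = action + dt * (v - guidance * corr)   # corrected Euler step
    return action

obs = np.array([0.5, -1.0, 2.0])
a = generate_action(obs)
```

With these toy stand-ins, both terms pull the sample toward the same target, so the corrected flow converges faster than the uncorrected one; in the paper's setting the two models are learned separately, which is what makes the mutual feedback informative rather than redundant.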
Problem

Research questions and friction points this paper is trying to address.

Learning action-aware dynamics for robotic manipulation tasks
Improving policy generalization beyond expert demonstration data
Enhancing robustness against visual distractions and lighting variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamics prediction integrated into policy learning
Mutual corrective feedback between policy and dynamics
Self-correction mechanism for improved generalization
👥 Authors
Dohyeok Lee
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Jung Min Lee
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Munkyung Kim
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Seokhun Ju
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Jin Woo Koo
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Kyungjae Lee
Department of Statistics, Korea University, Seoul 02841, Korea
Dohyeong Kim
Seoul National University
TaeHyun Cho
Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
Jungwoo Lee
Professor, Department of Electrical and Computer Engineering, Seoul National University