🤖 AI Summary
Imitation learning suffers from error accumulation in long-horizon, high-precision control tasks, and existing residual policies typically rely on local closed-loop correction without globally modeling state evolution, which limits generalization and robustness. This paper proposes the Koopman-guided Online Residual Refinement (KORR) framework, which employs the Koopman operator to construct a linear time-invariant latent dynamical model, enabling explicit global modeling of system evolution and accurate long-horizon prediction. Conditioned on the predicted latent states, KORR performs closed-loop residual correction. Evaluated on robotic furniture assembly under multiple disturbances, a challenging long-horizon manipulation task, KORR significantly improves task success rate, environmental robustness, and cross-scenario generalization over state-of-the-art baselines.
📝 Abstract
Imitation learning (IL) enables efficient skill acquisition from demonstrations but often struggles with long-horizon tasks and high-precision control due to compounding errors. Residual policy learning offers a promising, model-agnostic solution by refining a base policy through closed-loop corrections. However, existing approaches primarily focus on local corrections to the base policy, lacking a global understanding of state evolution, which limits robustness and generalization to unseen scenarios. To address this, we propose incorporating global dynamics modeling to guide residual policy updates. Specifically, we leverage Koopman operator theory to impose linear time-invariant structure in a learned latent space, enabling reliable state transitions and improved extrapolation for long-horizon prediction and unseen environments. We introduce KORR (Koopman-guided Online Residual Refinement), a simple yet effective framework that conditions residual corrections on Koopman-predicted latent states, enabling globally informed and stable action refinement. We evaluate KORR on long-horizon, fine-grained robotic furniture assembly tasks under various perturbations. Results demonstrate consistent gains in performance, robustness, and generalization over strong baselines. Our findings further highlight the potential of Koopman-based modeling to bridge modern learning methods with classical control theory.
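The control loop the abstract describes can be sketched in a few lines: lift the observation into a latent space, roll the latent state forward with a fixed linear (Koopman) operator, and condition a residual correction on the predicted latent state before adding it to the base policy's action. The sketch below is purely illustrative; all components (`encode`, the operator `K`, the base and residual heads) are random linear stand-ins for learned modules, and none of the names come from the paper's actual implementation.

```python
import numpy as np

# Hypothetical sketch of a KORR-style control step, assuming linear
# stand-ins for every learned component. Illustrative only.

rng = np.random.default_rng(0)
OBS_DIM, LATENT_DIM, ACT_DIM = 4, 8, 2

W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM))            # learned encoder (stand-in)
K = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))  # Koopman operator: linear, time-invariant
W_base = rng.normal(size=(ACT_DIM, OBS_DIM))              # base policy, e.g. from imitation learning
W_res = rng.normal(scale=0.01, size=(ACT_DIM, LATENT_DIM))  # residual head

def encode(obs):
    # Lift the raw observation into the Koopman latent space.
    return W_enc @ obs

def korr_step(obs):
    z = encode(obs)
    z_pred = K @ z            # one-step latent prediction under the LTI model
    a_base = W_base @ obs     # open-loop base action
    a_res = W_res @ z_pred    # residual conditioned on the predicted latent state
    return a_base + a_res     # globally informed, refined action

obs = rng.normal(size=OBS_DIM)
action = korr_step(obs)
assert action.shape == (ACT_DIM,)
```

Because `K` is time-invariant, long-horizon prediction reduces to repeated matrix application (`np.linalg.matrix_power(K, n) @ z`), which is what makes the latent rollout cheap and structurally constrained compared with a nonlinear learned dynamics model.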