Imitation Learning Policy based on Multi-Step Consistent Integration Shortcut Model

📅 2025-10-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the slow inference of flow-matching models in robot imitation learning, this paper proposes a one-step shortcut method with multi-step integration. The approach (1) extends the shortcut model with a multi-step consistency loss that splits the one-step loss into several multi-step losses, and (2) introduces an adaptive gradient allocation strategy that stabilizes joint optimization of the multi-step loss and the original flow-matching loss. Inspired by knowledge distillation, the method aims to preserve the expressive power of flow matching while making single-step inference practical. Evaluated on two simulation benchmarks and five real-robot tasks, it achieves 3.2–5.8× faster inference than baseline flow-matching models, outperforms existing distillation and consistency-based methods in task performance, and exhibits improved training stability.
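To make the inference-cost argument concrete, the sketch below contrasts multi-step Euler sampling with a single shortcut step. It is a toy illustration only: `toy_velocity` is a hypothetical stand-in for a learned, step-size-conditioned network, the 7-dimensional action and the target values are arbitrary, and none of the names come from the paper.

```python
import numpy as np

def toy_velocity(x, t, d, target):
    """Hypothetical stand-in for a learned velocity network conditioned on step size d.

    For a straight-line flow toward a single target action, the exact velocity
    is (target - x) / (1 - t). The step size d is accepted but unused in this toy.
    """
    return (target - x) / (1.0 - t)

def sample(velocity, x0, target, num_steps):
    """Euler-integrate from noise x0 at t=0 to an action at t=1.

    Returns the final sample and the number of network evaluations used.
    """
    x = np.asarray(x0, dtype=float)
    d = 1.0 / num_steps
    calls = 0
    for i in range(num_steps):
        t = i * d
        x = x + d * velocity(x, t, d, target)
        calls += 1
    return x, calls

rng = np.random.default_rng(0)
noise = rng.standard_normal(7)        # arbitrary 7-dimensional action (assumption)
target = np.linspace(-1.0, 1.0, 7)    # arbitrary target action (assumption)

# Standard flow-matching inference: many small steps, one network call each.
action_multi, calls_multi = sample(toy_velocity, noise, target, num_steps=10)
# Shortcut-style inference: a single step of size 1, one network call in total.
action_one, calls_one = sample(toy_velocity, noise, target, num_steps=1)
```

In this idealized case both samplers reach the same action, but the one-step sampler uses a single evaluation instead of ten. A learned network is only approximately self-consistent, which is the gap that a consistency loss is meant to close.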

📝 Abstract

The wide application of flow-matching methods has greatly promoted the development of robot imitation learning. However, these methods all suffer from long inference times. To address this issue, researchers have proposed distillation methods and consistency methods, but their performance still falls short of the original diffusion and flow-matching models. In this article, we propose a one-step shortcut method with multi-step integration for robot imitation learning. First, to balance inference speed and performance, we extend the shortcut model with a multi-step consistency loss, splitting the one-step loss into multi-step losses to improve the performance of one-step inference. Second, to address the unstable joint optimization of the multi-step loss and the original flow-matching loss, we propose an adaptive gradient allocation method that enhances the stability of the learning process. Finally, we evaluate the proposed method on two simulation benchmarks and five real-world tasks. The experimental results verify the effectiveness of the proposed algorithm.
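The abstract does not give the loss in closed form, so the following is only one plausible reading of "splitting the one-step loss into multi-step losses". It starts from the self-consistency target of shortcut models, where one jump of size 2d should match the average of two chained steps of size d, and generalizes that to k chained steps. The function names, the toy models, and the choice of k are assumptions made for illustration, not the paper's formulation.

```python
import numpy as np

def multi_step_target(model, x, t, d, k):
    """Average velocity over k chained steps of size d, starting at (x, t).

    With k=2 this is the two-step self-consistency target of shortcut models.
    In training, the target would be wrapped in a stop-gradient; NumPy has no
    autodiff, so that is implicit here.
    """
    x = np.asarray(x, dtype=float)
    velocities = []
    for _ in range(k):
        v = model(x, t, d)
        velocities.append(v)
        x = x + d * v      # follow the small step
        t = t + d
    return np.mean(velocities, axis=0)

def consistency_loss(model, x, t, d, k):
    """Squared error between one jump of size k*d and the k-step target."""
    prediction = model(x, t, k * d)
    target = multi_step_target(model, x, t, d, k)
    return float(np.mean((prediction - target) ** 2))

goal = np.array([0.5, -0.25, 1.0])    # arbitrary target action (assumption)

def exact_model(x, t, d):
    # Straight-line flow: self-consistent for every step size.
    return (goal - x) / (1.0 - t)

def biased_model(x, t, d):
    # Toy model whose error grows with step size, so large jumps disagree
    # with chains of small steps.
    return (goal - x) / (1.0 - t) + d

x_t = np.zeros(3)
loss_exact = consistency_loss(exact_model, x_t, t=0.0, d=0.25, k=4)
loss_biased = consistency_loss(biased_model, x_t, t=0.0, d=0.25, k=4)
```

The self-consistent model incurs essentially zero loss, while the model whose large-step predictions drift is penalized. That is the kind of signal such a loss would provide to improve one-step inference.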

Problem

Research questions and friction points this paper is trying to address.

Addresses high inference time in robot imitation learning

Improves performance of one-step inference using multi-step losses

Stabilizes learning process with adaptive gradient allocation

Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step shortcut method with multi-step integration

Multi-step consistency loss for improved performance

Adaptive gradient allocation for stable optimization
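The summary does not specify how the adaptive gradient allocation is computed. As a stand-in, the sketch below uses a simple norm-balancing rule: each loss's gradient is weighted inversely to its norm so that neither the flow-matching loss nor the multi-step loss dominates the update. This is a generic balancing heuristic, not the paper's rule, and `balance_gradients`, `grad_fm`, and `grad_ms` are hypothetical names.

```python
import numpy as np

def balance_gradients(grad_fm, grad_ms, eps=1e-12):
    """Combine two gradients with weights inversely proportional to their norms.

    Stand-in heuristic (assumption): after weighting, both gradients contribute
    updates of equal magnitude. The weights sum to 1.
    """
    norms = np.array([np.linalg.norm(grad_fm), np.linalg.norm(grad_ms)])
    inv = 1.0 / (norms + eps)          # eps guards against a zero gradient
    weights = inv / inv.sum()
    combined = weights[0] * grad_fm + weights[1] * grad_ms
    return combined, weights

# Toy gradients with very different scales (arbitrary values).
grad_fm = np.array([0.3, -0.1, 0.2])   # flow-matching loss gradient
grad_ms = np.array([30.0, 40.0, 0.0])  # multi-step loss gradient, norm 50

combined, weights = balance_gradients(grad_fm, grad_ms)
```

Here the much larger multi-step gradient receives the smaller weight, so one noisy loss term cannot swamp the other. A real allocation scheme might also account for gradient direction or training progress.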
