🤖 AI Summary
This work addresses the high inference latency of generative flow and diffusion models in robotic control, which stems from iterative sampling and hinders real-time deployment. To overcome this limitation, the authors propose a self-distillation framework that requires no pretrained teacher model and enables single-step generation of high-fidelity actions for low-latency, high-accuracy visuomotor policies. The approach combines a self-consistency loss, a self-guided regularization term, and a warm-start mechanism that exploits temporal correlations between consecutive actions to shorten the generative transport path and improve action quality. Evaluated across 56 simulated manipulation tasks, the method outperforms both 100-step diffusion and flow-based policies while achieving over 100× faster inference. It also surpasses the original 10-step policy when applied to the π₀.₅ model in RoboTwin 2.0.
📝 Abstract
Generative flow and diffusion models provide the continuous, multimodal action distributions needed for high-precision robotic policies. However, their reliance on iterative sampling introduces severe inference latency, degrading control frequency and harming performance in time-sensitive manipulation. To address this problem, we propose the One-Step Flow Policy (OFP), a from-scratch self-distillation framework for high-fidelity, single-step action generation without a pre-trained teacher. OFP combines a self-consistency loss to enforce coherent transport across time intervals with a self-guided regularization that sharpens predictions toward high-density expert modes. In addition, a warm-start mechanism leverages temporal action correlations to minimize the generative transport distance. Evaluations across 56 diverse simulated manipulation tasks demonstrate that a one-step OFP achieves state-of-the-art results, outperforming 100-step diffusion and flow policies while accelerating action generation by over $100\times$. We further integrate OFP into the $\pi_{0.5}$ model on RoboTwin 2.0, where one-step OFP surpasses the original 10-step policy. These results establish OFP as a practical, scalable solution for highly accurate and low-latency robot control.
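To make the abstract's ingredients concrete, here is a minimal NumPy sketch of the three pieces it names: the standard conditional flow-matching objective, a self-consistency loss that asks the one-step predictions from two points on the same flow trajectory to agree on the endpoint, and a warm start that begins generation from the previous action instead of pure noise. This is an illustration under our own assumptions, not the paper's implementation: `policy` is a toy linear velocity predictor, and the function names and the blend weight `alpha` are hypothetical.

```python
import numpy as np

def policy(x, t, W):
    """Toy velocity predictor (linear; a stand-in for the OFP network)."""
    return x @ W + t  # hypothetical form, for illustration only

def flow_matching_loss(W, a_expert, noise, t):
    """Conditional flow matching on the linear path x_t = (1-t)*noise + t*a_expert,
    whose ground-truth velocity is v* = a_expert - noise."""
    x_t = (1 - t) * noise + t * a_expert
    v_target = a_expert - noise
    v_pred = policy(x_t, t, W)
    return np.mean((v_pred - v_target) ** 2)

def self_consistency_loss(W, x_t, t, dt):
    """One-step endpoint estimates from (x_t, t) and from the Euler-advanced
    point (x_{t+dt}, t+dt) should coincide; a real implementation would put a
    stop-gradient on one branch."""
    v_t = policy(x_t, t, W)
    x_next = x_t + dt * v_t                               # Euler step along the flow
    end_from_t = x_t + (1 - t) * v_t                      # jump straight to t = 1
    end_from_next = x_next + (1 - (t + dt)) * policy(x_next, t + dt, W)
    return np.mean((end_from_t - end_from_next) ** 2)

def warm_start(prev_action, noise, alpha=0.8):
    """Start generation near the previous action (actions are temporally
    correlated), shrinking the transport distance the one-step map must cover."""
    return alpha * prev_action + (1 - alpha) * noise
```

In this toy setup both losses are simple mean-squared errors, so a gradient step on their (weighted) sum would jointly fit the flow-matching target and tighten self-consistency; the self-guided regularization from the abstract is omitted here, as its exact form is not specified in this excerpt.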