Streaming Flow Policy: Simplifying diffusion/flow-matching policies by treating action trajectories as flow trajectories

📅 2025-05-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address high computational redundancy, substantial execution latency, and poor real-time action generation in diffusion/flow-matching–based robot control, this paper proposes the "Action-as-Flow" modeling paradigm: directly parameterizing action trajectories as continuous flow fields to enable online, streaming action generation and real-time execution during sampling. Methodologically, the authors combine flow matching, incremental velocity-field integration, narrow Gaussian initialization around the last executed action, regularization that anchors flows to demonstration trajectories, and receding-horizon execution in the spirit of model predictive control (MPC). The core contribution is decoupling sampling from execution—eliminating iterative sampling delays—thereby drastically reducing sensor-to-actuator closed-loop latency. On imitation learning benchmarks, the approach preserves multimodal trajectory modeling capability and distributional stability while achieving significantly faster execution and surpassing state-of-the-art methods in control performance.

📝 Abstract
Recent advances in diffusion/flow-matching policies have enabled imitation learning of complex, multi-modal action trajectories. However, they are computationally expensive because they sample a trajectory of trajectories: a diffusion/flow trajectory of action trajectories. They discard intermediate action trajectories, and must wait for the sampling process to complete before any actions can be executed on the robot. We simplify diffusion/flow policies by treating action trajectories as flow trajectories. Instead of starting from pure noise, our algorithm samples from a narrow Gaussian around the last action. Then, it incrementally integrates a velocity field learned via flow matching to produce a sequence of actions that constitute a single trajectory. This enables actions to be streamed to the robot on-the-fly during the flow sampling process, and is well-suited for receding horizon policy execution. Despite streaming, our method retains the ability to model multi-modal behavior. We train flows that stabilize around demonstration trajectories to reduce distribution shift and improve imitation learning performance. Streaming flow policy outperforms prior methods while enabling faster policy execution and tighter sensorimotor loops for learning-based robot control. Project website: https://streaming-flow-policy.github.io/
Problem

Research questions and friction points this paper is trying to address.

Simplifying diffusion/flow-matching policies for faster execution
Enabling real-time action streaming during flow sampling
Improving imitation learning by reducing distribution shift
Innovation

Methods, ideas, or system contributions that make the work stand out.

Treats action trajectories as flow trajectories
Samples from narrow Gaussian around last action
Incrementally integrates learned velocity field
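The three ideas above can be sketched as a single sampling loop. This is a minimal, hypothetical illustration (the function and parameter names `stream_actions`, `velocity_field`, `horizon`, and `sigma` are ours, not from the paper): initialize from a narrow Gaussian around the last executed action, then Euler-integrate a learned velocity field, emitting each intermediate point as an executable action rather than discarding it.

```python
import numpy as np

def stream_actions(velocity_field, last_action, horizon=16, sigma=0.05, rng=None):
    """Sketch of streaming flow sampling (illustrative, not the paper's code).

    Instead of denoising from pure noise and waiting for sampling to finish,
    start near the last executed action and Euler-integrate a velocity field
    learned via flow matching; each integration step yields an action that
    can be streamed to the robot immediately.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Narrow Gaussian initialization around the last action (not pure noise).
    a = last_action + sigma * rng.standard_normal(np.shape(last_action))
    dt = 1.0 / horizon
    for k in range(horizon):
        t = k * dt
        # velocity_field stands in for the learned model v(a, t).
        a = a + dt * velocity_field(a, t)
        yield a  # stream this action on-the-fly during sampling

# Toy usage: a velocity field that pulls the action toward a target of 1.0,
# mimicking a flow that stabilizes around a demonstration trajectory.
actions = list(stream_actions(lambda a, t: -(a - 1.0),
                              np.zeros(2), horizon=8, sigma=0.0))
```

Under receding-horizon execution, only a prefix of the streamed actions would be executed before re-planning from the newest observation.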