Dense-Jump Flow Matching with Non-Uniform Time Scheduling for Robotic Policies: Mitigating Multi-Step Inference Degradation

๐Ÿ“… 2025-09-16
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Flow matching for robot policy learning suffers from two key issues: (1) premature saturation of generalization performance during early training, and (2) degraded inference performance when increasing Euler integration stepsโ€”stemming from uniform time sampling (causing oversampling at later times) and numerical instability induced by non-Lipschitz velocity fields near $t = 1$. To address these, we propose a U-shaped non-uniform time schedule that emphasizes modeling fidelity at critical initial and final time intervals, coupled with a dense skip integration mechanism that bypasses numerically unstable regions during inference, enabling stable single-step prediction. Our approach integrates time-aware flow matching, robust integration strategies, and explicit handling of non-Lipschitz dynamics. Evaluated on multi-task robotic benchmarks, it achieves an average +18.3% improvement over SOTA methods, with peak gains up to +23.7%, effectively mitigating the multi-step inference degradation commonly observed in existing flow-based policies.

Technology Category

Application Category

๐Ÿ“ Abstract
Flow matching has emerged as a competitive framework for learning high-quality generative policies in robotics; however, we find that generalisation arises and saturates early along the flow trajectory, in accordance with recent findings in the literature. We further observe that increasing the number of Euler integration steps during inference counter-intuitively and universally degrades policy performance. We attribute this to (i) additional, uniformly spaced integration steps oversample the late-time region, thereby constraining actions towards the training trajectories and reducing generalisation; and (ii) the learned velocity field becoming non-Lipschitz as integration time approaches 1, causing instability. To address these issues, we propose a novel policy that utilises non-uniform time scheduling (e.g., U-shaped) during training, which emphasises both early and late temporal stages to regularise policy training, and a dense-jump integration schedule at inference, which uses a single-step integration to replace the multi-step integration beyond a jump point, to avoid unstable areas around 1. Essentially, our policy is an efficient one-step learner that still pushes forward performance through multi-step integration, yielding up to 23.7% performance gains over state-of-the-art baselines across diverse robotic tasks.
Problem

Research questions and friction points this paper is trying to address.

Addresses multi-step inference degradation in robotic policies
Mitigates oversampling in late-time flow matching regions
Resolves instability from non-Lipschitz velocity fields
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-uniform time scheduling training
Dense-jump integration inference
Single-step replacement unstable areas
๐Ÿ”Ž Similar Papers
No similar papers found.
Z
Zidong Chen
Dept. Computing, Imperial College London
Z
Zihao Guo
Dept. Computing, Manchester Metropolitan University
P
Peng Wang
Dept. Computing, Manchester Metropolitan University
T
ThankGod Itua Egbe
Dept. Computing, Manchester Metropolitan University
Y
Yan Lyu
School of Computer Science and Engineering, Southeast University
Chenghao Qian
Chenghao Qian
Ph.D student, University of Leeds | @Parallel Domain | Ex. XPENG
Generative ModelsAutonomous DrivingAdverse Weather