OMP: One-step Meanflow Policy with Directional Alignment

๐Ÿ“… 2025-12-22
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
To address high inference latency, architectural complexity, and poor few-shot generalization in generative policies for robotic dexterous manipulation, this paper proposes MeanFlow++, a mean-flow policy enhancement framework. Methodologically, it (1) introduces a cosine-direction alignment loss that decouples the calibration of velocity direction from velocity magnitude, and (2) models dynamic trajectory evolution via differential derivative equations (DDEs), coupled with Jacobian-vector product (JVP)-based optimization to replace fixed-temperature denoising losses, enabling cooperative alignment between predicted mean velocities and ground-truth trajectories. The framework preserves single-step inference while significantly reducing computational overhead. Evaluated on the Adroit and Meta-World benchmarks, MeanFlow++ achieves higher average success rates than MP1 and FlowPolicy, with particularly notable gains on challenging Meta-World tasks. It thus strikes an effective balance between real-time execution and trajectory fidelity.
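The cosine-direction alignment idea in the summary can be sketched as a simple loss term that penalizes only the angle between the predicted and ground-truth mean velocities, leaving magnitude to a separate (e.g. MSE) term. This is a minimal NumPy sketch under that assumption; the function and variable names are illustrative and not taken from the paper's code.

```python
import numpy as np

def cosine_direction_loss(u_pred, u_true, eps=1e-8):
    """Direction-only alignment term: 1 - cos(angle between vectors).

    Magnitude errors are deliberately ignored, so this term can be
    combined with a separate magnitude loss, decoupling direction
    calibration from magnitude calibration.
    """
    u_pred = np.asarray(u_pred, dtype=float)
    u_true = np.asarray(u_true, dtype=float)
    cos = np.dot(u_pred, u_true) / (
        np.linalg.norm(u_pred) * np.linalg.norm(u_true) + eps
    )
    return 1.0 - cos

# Parallel vectors give ~0 loss regardless of scale;
# orthogonal vectors give loss ~1.0.
aligned = cosine_direction_loss([2.0, 0.0], [5.0, 0.0])
orthogonal = cosine_direction_loss([1.0, 0.0], [0.0, 1.0])
```

Because the term is scale-invariant, it cannot fight the magnitude loss over vector length, which is the decoupling the summary describes.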

๐Ÿ“ Abstract
Robot manipulation, a key capability of embodied AI, has turned to data-driven generative policy frameworks, but mainstream approaches suffer from high inference latency (Diffusion Models) or increased architectural complexity (Flow-based methods). While simply applying MeanFlow to robotic tasks achieves single-step inference and outperforms FlowPolicy, it lacks few-shot generalization due to fixed temperature hyperparameters in its Dispersive Loss and misalignment between predicted and true mean velocities. To solve these issues, this study proposes an improved MeanFlow-based policy: we introduce a lightweight Cosine Loss to align velocity directions and use the Differential Derivation Equation (DDE) to optimize the Jacobian-Vector Product (JVP) operator. Experiments on Adroit and Meta-World tasks show the proposed method outperforms MP1 and FlowPolicy in average success rate, especially on challenging Meta-World tasks, effectively enhancing the few-shot generalization and trajectory accuracy of robot manipulation policies while maintaining real-time performance, and offering a more robust solution for high-precision robotic manipulation.
Problem

Research questions and friction points this paper is trying to address.

Improves few-shot generalization in robot manipulation policies
Enhances trajectory accuracy while maintaining real-time performance
Addresses misaligned velocity predictions and fixed temperature hyperparameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight Cosine Loss aligns velocity directions
DDE optimizes Jacobian-Vector Product operator
Maintains real-time performance with enhanced generalization
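The DDE/JVP innovation above concerns how the total time derivative of the mean velocity is obtained when forming the MeanFlow training target u ≈ v - (t - r)·du/dt. One cheap stand-in for an exact Jacobian-vector product is a forward finite difference along the flow; whether this matches the paper's DDE formulation is an assumption, and the sketch below (plain NumPy, toy model, illustrative names) only demonstrates the general idea of replacing an exact JVP with a discrete derivative.

```python
import numpy as np

def meanflow_target(u_fn, z, v, r, t, delta=1e-3):
    """MeanFlow-style target: v - (t - r) * du/dt.

    The total derivative du/dt (normally a Jacobian-vector product
    through the network) is approximated here by a forward finite
    difference along the instantaneous velocity v, i.e. a discrete
    version of the differential equation governing the mean velocity.
    """
    u_now = u_fn(z, r, t)
    # advance the state a small step along the flow, then difference
    u_next = u_fn(z + delta * v, r, t + delta)
    du_dt = (u_next - u_now) / delta
    return v - (t - r) * du_dt

# Toy mean-velocity model u(z, r, t) = t * z (illustrative only).
u_fn = lambda z, r, t: t * z
z = np.array([1.0, -2.0])
v = np.array([0.5, 0.5])
tgt = meanflow_target(u_fn, z, v, r=0.2, t=0.8)
```

For this toy model the exact total derivative is du/dt = z + t·v, so the finite-difference target converges to v - (t - r)(z + t·v) as delta shrinks; in practice the step size trades bias against numerical noise.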
๐Ÿ”Ž Similar Papers
No similar papers found.
Han Fang
Global College, Shanghai Jiao Tong University, Shanghai, China
Yize Huang
School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
Yuheng Zhao
Fudan University
Data Visualization · Visual Analytics · Human-AI Collaboration
Paul Weng
Duke Kunshan University
Artificial Intelligence · Reinforcement Learning/Markov Decision Process · Qualitative/Ordinal Models
Xiao Li
School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
Yutong Ban
Global College, Shanghai Jiao Tong University, Shanghai, China