AI Summary
To address high inference latency, architectural complexity, and poor few-shot generalization in generative policies for dexterous robotic manipulation, this paper proposes MeanFlow++, a mean-flow policy enhancement framework. Methodologically, it (1) introduces a cosine-direction alignment loss that decouples the calibration of velocity direction from velocity magnitude, and (2) models dynamic trajectory evolution via differential derivative equations (DDEs), coupled with Jacobian-vector product (JVP)-based optimization that replaces fixed-temperature denoising losses, enabling cooperative alignment between predicted mean velocities and ground-truth trajectories. The framework preserves single-step inference while significantly reducing computational overhead. Evaluated on the Adroit and Meta-World benchmarks, MeanFlow++ achieves higher average success rates than MP1 and FlowPolicy, with particularly notable gains on challenging Meta-World tasks, striking an effective balance between real-time execution and trajectory fidelity.
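To illustrate the first idea, a direction-only alignment term can be written as one minus the cosine similarity between predicted and ground-truth mean velocities, which is invariant to velocity magnitude. The sketch below is a minimal NumPy illustration of this general construction, not the paper's implementation; the function name and epsilon stabilizer are assumptions.

```python
import numpy as np

def cosine_direction_loss(v_pred, v_true, eps=1e-8):
    """Penalize directional misalignment between predicted and ground-truth
    velocity vectors, independent of their magnitudes.

    v_pred, v_true: arrays of shape (..., dim); returns a scalar in [0, 2].
    """
    dot = np.sum(v_pred * v_true, axis=-1)
    norms = np.linalg.norm(v_pred, axis=-1) * np.linalg.norm(v_true, axis=-1)
    cos_sim = dot / (norms + eps)  # eps guards against zero-norm vectors
    return float(np.mean(1.0 - cos_sim))
```

Because the cosine term ignores magnitude, it would typically be combined with a separate magnitude (e.g. regression) loss, which is consistent with the "decoupled" calibration described above.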
Abstract
Robot manipulation, a key capability of embodied AI, has shifted toward data-driven generative policy frameworks, but mainstream approaches have drawbacks: Diffusion Models incur high inference latency, while Flow-based Methods add architectural complexity. Simply applying MeanFlow to robotic tasks achieves single-step inference and outperforms FlowPolicy, but it lacks few-shot generalization due to the fixed temperature hyperparameters in its Dispersive Loss and the misalignment between predicted and true mean velocities. To address these issues, this study proposes an improved MeanFlow-based policy: we introduce a lightweight Cosine Loss to align velocity directions and use the Differential Derivative Equation (DDE) to optimize the Jacobian-Vector Product (JVP) operator. Experiments on Adroit and Meta-World tasks show that the proposed method outperforms MP1 and FlowPolicy in average success rate, especially on challenging Meta-World tasks. It thus enhances the few-shot generalization and trajectory accuracy of robot manipulation policies while maintaining real-time performance, offering a more robust solution for high-precision robotic manipulation.
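For intuition on the JVP operator mentioned above: a Jacobian-vector product is the directional derivative J(x)v of a vector-valued function, which MeanFlow-style training uses to differentiate the mean-velocity field along the trajectory. The sketch below approximates it with central finite differences purely for illustration; the paper presumably relies on forward-mode automatic differentiation rather than this numerical scheme, and the function name and step size are assumptions.

```python
import numpy as np

def jvp_finite_diff(f, x, v, eps=1e-6):
    """Approximate the Jacobian-vector product J(x) @ v of f at x along
    direction v using a central finite difference:
        J(x) @ v  ~  (f(x + eps*v) - f(x - eps*v)) / (2*eps)
    """
    return (f(x + eps * v) - f(x - eps * v)) / (2.0 * eps)

# Example: f(x) = x**2 elementwise has Jacobian diag(2x),
# so J(x) @ v = 2 * x * v.
x = np.array([1.0, 2.0])
v = np.array([1.0, 0.0])
print(jvp_finite_diff(lambda z: z ** 2, x, v))  # ~ [2.0, 0.0]
```

In an autodiff framework the same quantity is available exactly in a single forward pass, which is what makes JVP-based training terms cheap relative to full Jacobians.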