Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation

📅 2026-03-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing equivariant policies are computationally expensive, rely on single-modality inputs, and become unstable when combined with fast sampling, which makes it hard to balance efficiency and performance. This work proposes E3Flow, which the authors describe as the first framework to unify efficient rectified flow with multimodal SE(3)-equivariant learning. E3Flow uses spherical harmonic representations to achieve strict SO(3) equivariance and introduces an invariant Feature Enhancement Module (FEM) that dynamically fuses point cloud and image information. On eight simulated MimicGen tasks, E3Flow improves the average success rate by 3.12% over the state-of-the-art Spherical Diffusion Policy (SDP) while running inference 7× faster. Four real-world robot experiments further validate its effectiveness.
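To illustrate why rectified flow permits fast sampling, here is a minimal sketch of fixed-step Euler integration of a velocity field from noise (t = 0) to an action (t = 1). The names `euler_sample` and `toy_velocity` are hypothetical, and the hand-written velocity field stands in for a learned network; this is not E3Flow's implementation. It assumes perfectly straight transport paths, which is the idealized case rectified flow aims for.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)
    return x

# Hypothetical target action (e.g. a 3-D end-effector displacement).
target = np.array([0.2, -0.1, 0.4])

def toy_velocity(x, t):
    # Assumption: ideal straight-line path x_t = (1 - t) * x0 + t * target,
    # whose velocity is (target - x_t) / (1 - t). A real policy would
    # replace this with a trained, observation-conditioned network.
    return (target - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.normal(size=3)

one_step = euler_sample(toy_velocity, noise, num_steps=1)
ten_step = euler_sample(toy_velocity, noise, num_steps=10)
print(np.allclose(one_step, target), np.allclose(ten_step, target))
```

When the paths are straight, a single Euler step already lands on the target, so extra steps add cost without improving the sample. A learned velocity field is only approximately straight, so in practice the number of steps is a trade-off between speed and accuracy.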

📝 Abstract

While existing equivariant methods enhance data efficiency, they suffer from high computational intensity, reliance on single-modality inputs, and instability when combined with fast-sampling methods. In this work, we propose E3Flow, a novel framework that addresses these critical limitations of equivariant diffusion policies and, for the first time, unifies efficient rectified flow with stable, multi-modal equivariant learning. Our framework is built upon spherical harmonic representations to ensure rigorous SO(3) equivariance. We introduce a novel invariant Feature Enhancement Module (FEM) that dynamically fuses hybrid visual modalities (point clouds and images), injecting rich visual cues into the spherical harmonic features. We evaluate E3Flow on 8 manipulation tasks from the MimicGen benchmark and further conduct 4 real-world experiments to validate its effectiveness in physical environments. Simulation results show that E3Flow achieves a 3.12% improvement in average success rate over the state-of-the-art Spherical Diffusion Policy (SDP) while simultaneously delivering a 7× inference speedup. E3Flow thus demonstrates a new and highly effective trade-off between task performance, inference efficiency, and data efficiency for robotic policy learning. Code: https://github.com/zql-kk/E3Flow.
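The SO(3) equivariance property mentioned in the abstract can be checked numerically: rotating the input and then applying the map should give the same result as applying the map and then rotating the output. The sketch below tests this for a toy vector-valued feature. Degree-1 spherical harmonic features transform like ordinary 3-D vectors (up to a change of basis), so the rotation matrix itself acts on the output here; higher degrees would need Wigner D-matrices instead. All function names are hypothetical, and the toy map is an illustration, not the paper's architecture.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:      # ensure a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def toy_equivariant_map(points):
    """Weighted sum of centered points with rotation-invariant weights."""
    centered = points - points.mean(axis=0)
    weights = np.exp(-np.linalg.norm(centered, axis=1))  # depends on norms only
    return (weights[:, None] * centered).sum(axis=0)

def equivariance_error(fn, points, rotation):
    """Max absolute difference between f(R x) and R f(x)."""
    rotated_then_mapped = fn(points @ rotation.T)  # each row p becomes R p
    mapped_then_rotated = rotation @ fn(points)
    return float(np.max(np.abs(rotated_then_mapped - mapped_then_rotated)))

rng = np.random.default_rng(0)
cloud = rng.normal(size=(64, 3))   # stand-in point cloud
R = random_rotation(rng)
err = equivariance_error(toy_equivariant_map, cloud, R)
print(err < 1e-10)
```

The same `equivariance_error` helper could serve as a unit test for any candidate layer: an error well above floating-point noise indicates that the layer breaks equivariance.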

Problem

Research questions and friction points this paper is trying to address.

equivariant methods

computational intensity

multi-modality

fast-sampling instability

robot manipulation

Innovation

Methods, ideas, or system contributions that make the work stand out.

SE(3)-equivariance

spherical harmonics

rectified flow

multi-modal fusion

robotic manipulation
