Efficient Hybrid SE(3)-Equivariant Visuomotor Flow Policy via Spherical Harmonics for Robot Manipulation

📅 2026-03-24
🤖 AI Summary
Existing equivariant methods suffer from high computational cost, reliance on single-modality inputs, and instability when combined with fast sampling, which makes it hard to balance efficiency and performance. This work proposes E3Flow, a framework that unifies efficient rectified flow with multimodal SE(3)-equivariant learning for the first time. E3Flow builds on spherical harmonic representations to achieve strict SO(3) equivariance and introduces an invariant Feature Enhancement Module (FEM) that dynamically fuses point cloud and image information. On eight simulated tasks from MimicGen, E3Flow improves the average success rate by 3.12% over the state-of-the-art Spherical Diffusion Policy (SDP) while running inference 7× faster. Four real-world robotic experiments further validate its effectiveness.

📝 Abstract
While existing equivariant methods enhance data efficiency, they suffer from high computational intensity, reliance on single-modality inputs, and instability when combined with fast-sampling methods. In this work, we propose E3Flow, a novel framework that addresses these critical limitations of equivariant diffusion policies by unifying efficient rectified flow with stable, multi-modal equivariant learning for the first time. Our framework is built upon spherical harmonic representations to ensure rigorous SO(3) equivariance. We introduce a novel invariant Feature Enhancement Module (FEM) that dynamically fuses hybrid visual modalities (point clouds and images), injecting rich visual cues into the spherical harmonic features. We evaluate E3Flow on 8 manipulation tasks from MimicGen and further conduct 4 real-world experiments to validate its effectiveness in physical environments. Simulation results show that E3Flow achieves a 3.12% improvement in average success rate over the state-of-the-art Spherical Diffusion Policy (SDP) while simultaneously delivering a 7× inference speedup. E3Flow thus demonstrates a new and highly effective trade-off between task performance, inference efficiency, and data efficiency for robotic policy learning. Code: https://github.com/zql-kk/E3Flow.
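To make the "fast sampling" idea concrete, the following is a minimal sketch of generic rectified flow, not E3Flow's actual implementation. It assumes a straight-line path between a noise sample and a target action, and it replaces the learned velocity network with a hypothetical closed-form `oracle_velocity`; all function names and the toy 3-D target are illustrative assumptions. The point it shows is that when the velocity field follows straight paths, a handful of Euler steps (even one) is enough to reach the target, which is where the inference savings over many-step diffusion sampling come from.

```python
import numpy as np


def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1


def target_velocity(x0, x1):
    """Regression target for the velocity network along a straight path."""
    return x1 - x0


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t never hits zero
        x = x + dt * velocity_fn(x, t)
    return x


# Hypothetical toy target standing in for a robot action (assumption).
target = np.array([0.3, -0.1, 0.5])


def oracle_velocity(x, t):
    """Stand-in for a trained network: the exact straight-path velocity.

    On the path x = (1 - t) * x0 + t * target, the constant velocity
    target - x0 equals (target - x) / (1 - t).
    """
    return (target - x) / (1.0 - t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(3)
    for steps in (1, 4):
        action = euler_sample(oracle_velocity, noise, steps)
        print(steps, np.abs(action - target).max())
```

In practice the velocity would come from a trained, conditioned network rather than an oracle, so the number of steps needed depends on how straight the learned paths actually are.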
Problem

Research questions and friction points this paper is trying to address.

equivariant methods
computational intensity
multi-modality
fast-sampling instability
robot manipulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

SE(3)-equivariance
spherical harmonics
rectified flow
multi-modal fusion
robotic manipulation