MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation

📅 2025-02-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Scene flow estimation suffers from fine-grained geometric information loss due to voxelization and insufficient spatiotemporal modeling capability. To address these challenges, we propose a novel 3D scene flow estimation framework for point cloud sequences. Our method introduces a flow-guided Mamba state-space decoder that incorporates point-level displacement priors into the global modeling of voxel features. We further design a voxel-point feature decoupling and remapping mechanism to mitigate voxelization-induced distortion, and formulate a scene-adaptive loss function that dynamically weights regression errors across diverse motion patterns. Evaluated on the Argoverse 2 dataset, our approach achieves state-of-the-art performance while maintaining real-time inference speed. It significantly improves estimation accuracy and robustness in complex urban dynamic scenes.

📝 Abstract

Scene flow estimation aims to predict 3D motion from consecutive point cloud frames, a task of great interest in autonomous driving. Existing methods suffer from insufficient spatio-temporal modeling and an inherent loss of fine-grained features during voxelization. The success of Mamba, a representative state space model (SSM) that enables global modeling with linear complexity, offers a promising solution. In this paper, we propose MambaFlow, a novel scene flow estimation network with a Mamba-based decoder. Its backbone enables deep interaction and coupling of spatio-temporal features. We use point offset information to steer the global modeling of voxel-based features in an efficient Mamba-based decoder, which learns voxel-to-point patterns that are used to devoxelize shared voxel representations into point-wise features. To further enhance the model's generalization across diverse scenarios, we propose a novel scene-adaptive loss function that automatically adapts to different motion patterns. Extensive experiments on the Argoverse 2 benchmark demonstrate that MambaFlow achieves state-of-the-art performance among existing works while running at real-time inference speed, enabling accurate flow estimation in real-world urban scenarios. The code is available at https://github.com/SCNU-RISLAB/MambaFlow.
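
To make the voxel-to-point idea in the abstract concrete, here is a minimal NumPy sketch. It is not the MambaFlow implementation. It assumes mean pooling for voxel features and a fixed devoxelization step that concatenates each point's offset from its voxel center; the names `voxelize` and `devoxelize`, the `voxel_size` parameter, and both design choices are illustrative assumptions. In the paper, the voxel-to-point patterns are learned by the Mamba-based decoder rather than fixed.

```python
import numpy as np


def voxelize(points, voxel_size):
    """Mean-pool points that fall into the same voxel (assumed pooling rule)."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    # uniq: one row per occupied voxel; inverse: voxel index of every point
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    sums = np.zeros((len(uniq), points.shape[1]))
    np.add.at(sums, inverse, points)  # unbuffered scatter-add per voxel
    counts = np.bincount(inverse, minlength=len(uniq))
    voxel_feats = sums / counts[:, None]
    centers = (uniq + 0.5) * voxel_size
    return voxel_feats, centers, inverse


def devoxelize(voxel_feats, centers, inverse, points):
    """Give each point its voxel's shared feature plus its own offset."""
    shared = voxel_feats[inverse]          # identical for points in one voxel
    offsets = points - centers[inverse]    # point-level detail lost by pooling
    return np.concatenate([shared, offsets], axis=1)


points = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.5, 0.1, 0.1]])
voxel_feats, centers, inverse = voxelize(points, voxel_size=1.0)
point_feats = devoxelize(voxel_feats, centers, inverse, points)
```

The first two points share a voxel, so their shared features are identical and only the offset columns tell them apart. This is the fine-grained information that voxelization alone discards.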

Problem

Research questions and friction points this paper is trying to address.

Predict 3D motion from point cloud frames.

Address spatio-temporal modeling challenges.

Enhance generalization with a scene-adaptive loss function.
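
As a rough illustration of the prediction task listed above, the sketch below treats scene flow as one 3D displacement vector per point and scores a prediction with the mean end-point error, a commonly used scene flow metric. The function name `end_point_error` and the toy arrays are assumptions for illustration; the page does not state which metrics the paper reports.

```python
import numpy as np


def end_point_error(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and true flow vectors."""
    return float(np.linalg.norm(pred_flow - gt_flow, axis=1).mean())


# Two points at time t and their true displacement to time t+1 (toy values).
points_t = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
gt_flow = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 0.0]])
pred_flow = np.zeros_like(gt_flow)  # a model that predicts no motion

warped = points_t + pred_flow       # where the model thinks the points end up
epe = end_point_error(pred_flow, gt_flow)
```

Here the first point is off by 5 units and the second by 0, so the mean end-point error is 2.5.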

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba-based decoder

Scene-adaptive loss function

Voxel-to-point pattern learning
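
The page does not give the form of the scene-adaptive loss, so the following is only a hypothetical sketch of one way a loss can adapt to motion patterns: points are grouped by ground-truth speed, and each non-empty group contributes equally. The function name `bucketed_flow_loss` and the thresholds `slow_thresh` and `fast_thresh` are assumptions, not values from the paper.

```python
import numpy as np


def bucketed_flow_loss(pred_flow, gt_flow, slow_thresh=0.05, fast_thresh=0.1):
    """Average per-bucket errors so rare fast movers are not drowned out.

    Thresholds are hypothetical displacement magnitudes per frame pair.
    """
    err = np.linalg.norm(pred_flow - gt_flow, axis=1)
    speed = np.linalg.norm(gt_flow, axis=1)
    masks = [
        speed < slow_thresh,                              # near-static points
        (speed >= slow_thresh) & (speed < fast_thresh),   # slow movers
        speed >= fast_thresh,                             # fast movers
    ]
    # Skip empty buckets so the loss adapts to what the scene contains.
    bucket_means = [err[m].mean() for m in masks if m.any()]
    return float(np.mean(bucket_means))


gt_flow = np.array([[0.0, 0.0, 0.0]] * 3 + [[1.0, 0.0, 0.0]])
pred_flow = np.zeros_like(gt_flow)  # predicts no motion anywhere

loss = bucketed_flow_loss(pred_flow, gt_flow)
plain_mean = float(np.linalg.norm(pred_flow - gt_flow, axis=1).mean())
```

In this toy scene three of four points are static, so a plain mean gives 0.25 while the bucketed loss gives 0.5: the single moving point carries as much weight as the whole static background.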
