🤖 AI Summary
To address two critical limitations of Mamba in point cloud representation learning, the distortion of 3D geometric adjacency during sequence processing and the loss of long-sequence memory as input length grows, this paper proposes StruMamba3D. First, it introduces spatial states that act as proxies to preserve spatial dependencies among points within the State Space Model (SSM). Second, it enhances the SSM with a state-wise update strategy and a lightweight convolution module that enables interactions between spatial states for efficient structure modeling. Third, it designs a sequence length-adaptive strategy that reduces the sensitivity of the pre-trained model to varying input lengths. Evaluated on standard benchmarks, StruMamba3D achieves 95.1% accuracy on ModelNet40 and 92.75% on the hardest split of ScanObjectNN (without voting), establishing new state-of-the-art performance across four downstream tasks for self-supervised point cloud representation learning.
📝 Abstract
Recently, Mamba-based methods have demonstrated impressive performance in point cloud representation learning by leveraging the State Space Model (SSM), with its efficient context modeling ability and linear complexity. However, these methods still face two key issues that limit the potential of the SSM: they destroy the adjacency of 3D points during SSM processing, and they fail to retain long-sequence memory as the input length increases in downstream tasks. To address these issues, we propose StruMamba3D, a novel paradigm for self-supervised point cloud representation learning. It enjoys several merits. First, we design spatial states and use them as proxies to preserve spatial dependencies among points. Second, we enhance the SSM with a state-wise update strategy and incorporate a lightweight convolution to facilitate interactions between spatial states for efficient structure modeling. Third, our method reduces the sensitivity of pre-trained Mamba-based models to varying input lengths by introducing a sequence length-adaptive strategy. Experimental results across four downstream tasks showcase the superior performance of our method. In addition, our method attains state-of-the-art 95.1% accuracy on ModelNet40 and 92.75% accuracy on the most challenging split of ScanObjectNN without the voting strategy.
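For readers unfamiliar with the SSM backbone the abstract builds on, the core mechanism is a discretized linear state-space recurrence scanned over the token sequence. The sketch below is a generic, minimal illustration of that recurrence, not the paper's StruMamba3D layer: the function name `ssm_scan`, the diagonal transition, and the simple zero-order-hold-style discretization are all assumptions for exposition. The paper's contribution is to bind such hidden states to spatial structure (spatial states as proxies) rather than to a fixed serialization order.

```python
import numpy as np

def ssm_scan(x, A, B, C, dt):
    """Minimal discretized state-space recurrence over a token sequence.

    x:  (T, d_in)        input tokens (e.g. embedded point patches)
    A:  (d_state,)       diagonal continuous-time state transition (negative for stability)
    B:  (d_state, d_in)  input projection into the state
    C:  (d_out, d_state) readout from the state
    dt: scalar step size used to discretize A and B
    """
    A_bar = np.exp(dt * A)            # discretized transition, applied elementwise
    h = np.zeros(A.shape[0])          # hidden state carries context along the scan
    ys = []
    for x_t in x:                     # one pass: linear in sequence length T
        h = A_bar * h + dt * (B @ x_t)
        ys.append(C @ h)
    return np.stack(ys)               # (T, d_out)
```

Because the state `h` is updated strictly in scan order, serializing a point cloud into a 1D sequence is what breaks 3D adjacency (points close in space can land far apart in the scan), and the repeated multiplication by `A_bar` with `|A_bar| < 1` is what causes early-token memory to decay on long inputs; these are exactly the two issues the abstract targets.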