Mantis: Mamba-native Tuning is Efficient for 3D Point Cloud Foundation Models

📅 2026-05-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing parameter-efficient fine-tuning (PEFT) methods transfer poorly to Mamba-based foundation models for 3D point clouds, often causing accuracy degradation and unstable training. To address this, the work proposes the first PEFT framework tailored to the native Mamba architecture. It introduces a State-Aware Adapter (SAA) that performs task adaptation at the state level, and a Dual-Serialization Consistency Distillation (DSCD) mechanism that stabilizes optimization by aligning representations across different serialization orders. With only about 5% trainable parameters, the method achieves competitive performance across multiple 3D point cloud benchmarks, addressing the mismatch between the state-level dynamics of state-space models and the token-level granularity of conventional adaptation methods.
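
The idea of aligning outputs across serialization orders can be illustrated with a toy example. The sketch below is not the paper's DSCD loss: the axis-sorted serializations, the order-sensitive toy recurrence, and the symmetric KL term are all assumptions chosen only to show why two valid orderings of the same point cloud can disagree and how a consistency term measures that disagreement.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def symmetric_consistency(logits_a, logits_b):
    """Symmetric KL between two predictions (an assumed choice of consistency term)."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def serialize(points, axis_order):
    """Hypothetical serialization: sort points lexicographically by the given axes."""
    return sorted(points, key=lambda pt: tuple(pt[a] for a in axis_order))

def toy_sequence_model(sequence, decay=0.5):
    """Order-sensitive stand-in for a sequence backbone; the final state acts as logits."""
    state = [0.0, 0.0, 0.0]
    for pt in sequence:
        state = [decay * s + c for s, c in zip(state, pt)]
    return state

points = [(0.1, 0.9, 0.3), (0.7, 0.2, 0.5), (0.4, 0.4, 0.8)]
order_xyz = serialize(points, (0, 1, 2))  # sorted by x first
order_zyx = serialize(points, (2, 1, 0))  # sorted by z first

# The same points in a different order give different outputs, so the term is positive.
loss = symmetric_consistency(toy_sequence_model(order_xyz),
                             toy_sequence_model(order_zyx))
print(f"consistency loss between the two serializations: {loss:.6f}")
```

Minimizing such a term alongside the task loss would push the two branches toward the same prediction. Whether DSCD treats one branch as a fixed teacher or is symmetric is not stated in this summary.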

📝 Abstract

Pre-trained 3D point cloud foundation models (PFMs) have demonstrated strong transferability across diverse downstream tasks. However, fully fine-tuning these models is computationally expensive and storage-intensive. Parameter-efficient fine-tuning (PEFT) offers a promising alternative, but existing PEFT approaches are primarily designed for Transformer-based backbones and rely on token-level prompting or feature transformation. Mamba-based backbones introduce a granularity mismatch between token-level adaptation and state-level sequence dynamics. Consequently, directly transferring existing PEFT approaches to frozen Mamba backbones leads to substantial accuracy degradation and unstable optimization. To address this issue, we propose Mantis, the first Mamba-native PEFT framework for 3D PFMs. Specifically, a State-Aware Adapter (SAA) injects lightweight task-conditioned control signals into the selective state-space updates, enabling state-level adaptation while keeping the pre-trained backbone frozen. Moreover, Dual-Serialization Consistency Distillation (DSCD) regularizes the model across different valid point cloud serializations, thereby reducing serialization-induced instability. Extensive experiments across multiple benchmarks demonstrate that Mantis achieves competitive performance with only about 5% trainable parameters. Our code is available at https://github.com/gzhhhhhhh/Mantis.
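
To make "state-level adaptation" concrete, the following is a minimal sketch of a scalar selective state-space recurrence in which a small adapter shifts the input-dependent step size while the backbone weights stay fixed. It is an illustration under stated assumptions, not the SAA from the paper: the names `selective_scan` and `state_adapter`, the choice of the step size as the injection point, and the tanh bottleneck are all hypothetical.

```python
import math

def softplus(v):
    """Smooth positive mapping commonly used for SSM step sizes."""
    return math.log1p(math.exp(v))

def selective_scan(xs, a=-1.0, w_delta=1.0, w_b=1.0, w_c=1.0, delta_shift=None):
    """Scalar selective SSM: h_t = exp(delta_t * a) * h_{t-1} + delta_t * B_t * x_t.

    a, w_delta, w_b, w_c play the role of frozen backbone weights.
    delta_shift is an optional per-step control signal added before the softplus.
    """
    h = 0.0
    ys = []
    for t, x in enumerate(xs):
        shift = 0.0 if delta_shift is None else delta_shift[t]
        delta = softplus(w_delta * x + shift)  # input-dependent step size
        a_bar = math.exp(delta * a)            # discretized state transition
        b_t = w_b * x                          # input-dependent B_t
        h = a_bar * h + delta * b_t * x        # state update
        ys.append((w_c * x) * h)               # readout with input-dependent C_t
    return ys

def state_adapter(xs, task_code, down=0.5, up=0.0):
    """Hypothetical adapter: up * tanh(down * x + task_code).

    With up=0.0 the shift is zero, so the frozen model's behaviour is
    reproduced before any adapter training.
    """
    return [up * math.tanh(down * x + task_code) for x in xs]

xs = [0.2, -0.5, 1.0, 0.3]
frozen = selective_scan(xs)
zero_init = selective_scan(xs, delta_shift=state_adapter(xs, task_code=0.7))
adapted = selective_scan(xs, delta_shift=state_adapter(xs, task_code=0.7, up=0.3))

print("frozen   :", [round(y, 4) for y in frozen])
print("zero-init:", [round(y, 4) for y in zero_init])
print("adapted  :", [round(y, 4) for y in adapted])
```

In this toy setting only `down`, `up`, and `task_code` would be trained. Because the shift changes how much of the previous state is retained at each step, its effect propagates through the recurrence, whereas a token-level prompt only alters the inputs.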

Problem

Research questions and friction points this paper is trying to address.

Parameter-Efficient Fine-Tuning

Mamba

3D Point Cloud Foundation Models

State-Space Models

Granularity Mismatch

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mamba

Parameter-Efficient Fine-Tuning

State-Space Model