🤖 AI Summary
To address the prohibitive computational and memory overhead of full-parameter fine-tuning for point cloud self-supervised pre-trained models, this paper proposes a lightweight and efficient fine-tuning method. It pioneers the integration of Low-Rank Adaptation (LoRA) with a multi-scale token selection mechanism within point cloud Transformer architectures. Specifically, LoRA compresses the trainable parameter space via low-rank matrix decomposition, while a multi-scale importance scoring scheme dynamically selects discriminative local tokens, preserving fine-grained geometric structures while enhancing global representation learning. Evaluated on three standard point cloud benchmarks, the method achieves over 98% of the performance of full-parameter fine-tuning while updating only 3.43% of the parameters. This substantial reduction in trainable parameter count significantly improves deployment efficiency and scalability, particularly under resource-constrained conditions, without compromising model accuracy.
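The core LoRA idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer dimensions, rank, scaling, and initialization choices here are assumptions (though zero-initializing the up-projection so fine-tuning starts from the pre-trained behavior is standard LoRA practice).

```python
import numpy as np

class LoRALinear:
    """Sketch of a linear layer augmented with a trainable low-rank update.

    y = W x + (alpha / r) * B A x, where W is the frozen pre-trained weight
    and only the low-rank factors A (r x d_in) and B (d_out x r) are tuned.
    """

    def __init__(self, d_in, d_out, rank=8, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pre-trained weight (random stand-in for illustration).
        self.W = rng.standard_normal((d_out, d_in)) * 0.02
        # Trainable down-projection, small random init.
        self.A = rng.standard_normal((rank, d_in)) * 0.01
        # Trainable up-projection, zero init: the LoRA branch starts as a
        # no-op, so fine-tuning begins from the pre-trained model exactly.
        self.B = np.zeros((d_out, rank))
        self.scale = alpha / rank

    def __call__(self, x):
        # Base path plus scaled low-rank correction.
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(d_in=384, d_out=384, rank=8)
tokens = np.ones((64, 384))  # e.g. 64 point-cloud patch tokens
out = layer(tokens)          # shape (64, 384)
```

With rank 8 against a 384x384 frozen weight, the trainable factors hold 2 * 8 * 384 parameters versus 384 * 384 for the full matrix, which is the source of the parameter savings the summary describes.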
📝 Abstract
Self-supervised representation learning for point clouds has demonstrated effectiveness in improving pre-trained model performance across diverse tasks. However, as pre-trained models grow in complexity, fully fine-tuning them for downstream applications demands substantial computational and storage resources. Parameter-efficient fine-tuning (PEFT) methods offer a promising solution to mitigate these resource requirements, yet most current approaches rely on complex adapter and prompt mechanisms that increase the number of tunable parameters. In this paper, we propose PointLoRA, a simple yet effective method that combines low-rank adaptation (LoRA) with multi-scale token selection to efficiently fine-tune point cloud models. Our approach embeds LoRA layers within the most parameter-intensive components of point cloud transformers, reducing the number of tunable parameters while enhancing global feature capture. Additionally, multi-scale token selection extracts critical local information to serve as prompts for downstream fine-tuning, effectively complementing the global context captured by LoRA. Experimental results across various pre-trained models and three challenging public datasets demonstrate that our approach achieves competitive performance with only 3.43% of the trainable parameters, making it highly effective for resource-constrained applications. Source code is available at: https://github.com/songw-zju/PointLoRA.
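The multi-scale token selection step can be sketched as below. This is a hedged illustration only: the paper learns an importance score, whereas this stand-in ranks tokens by feature norm, and the function names, per-scale top-k policy, and concatenation into prompts are assumptions.

```python
import numpy as np

def score_tokens(tokens):
    """Hypothetical importance proxy: feature L2 norm per token.

    PointLoRA uses a learned multi-scale importance scoring scheme; this
    fixed proxy only illustrates the select-by-score mechanism.
    """
    return np.linalg.norm(tokens, axis=-1)

def multi_scale_select(token_scales, k_per_scale):
    """Keep the top-k highest-scoring tokens at each grouping scale.

    token_scales: list of (n_i, d) arrays, one per scale (e.g. coarse and
    fine local groupings of the point cloud). The selected tokens are
    concatenated and would serve as prompt tokens during fine-tuning,
    carrying local geometric detail alongside LoRA's global adaptation.
    """
    selected = []
    for tokens in token_scales:
        idx = np.argsort(score_tokens(tokens))[::-1][:k_per_scale]
        selected.append(tokens[idx])
    return np.concatenate(selected, axis=0)

# Toy usage: two scales with 2D features, keep one token per scale.
coarse = np.array([[1.0, 0.0], [3.0, 0.0]])
fine = np.array([[2.0, 0.0], [5.0, 0.0], [4.0, 0.0]])
prompts = multi_scale_select([coarse, fine], k_per_scale=1)
```

Selecting per scale rather than globally is one way to guarantee that both coarse and fine geometric structures contribute prompts, matching the abstract's claim that local detail complements LoRA's global context.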