🤖 AI Summary
Existing training-free point cloud recognition methods typically model only one kind of feature, either geometric or semantic, so they struggle to capture structure and meaning jointly and generalize poorly in few-shot settings. To address this, we propose the first training-free dual-branch point cloud recognition framework that models geometry and semantics together: a geometric branch extracts features non-parametrically, and a semantic branch uses a CLIP-style model whose point cloud features are aligned with text features. We introduce two components: (i) a Geometric Feature Enhancement (GFE) module that supplements the geometric information of point clouds, and (ii) a Multi-domain Few-shot Feature Calibration (MFF) module that improves performance in few-shot settings. Experiments on ModelNet-40 and ScanObjectNN show that our method outperforms existing training-free approaches, achieving state-of-the-art performance on both zero-shot and few-shot recognition benchmarks.
📝 Abstract
Training-free methods for point cloud recognition are becoming increasingly popular because they significantly reduce computational resources and time costs. However, existing approaches are limited in that they typically extract either geometric or semantic features, but not both. To address this limitation, we propose the first training-free method that integrates both geometric and semantic features. In the geometric branch, we adopt a non-parametric strategy to extract geometric features. In the semantic branch, we leverage a model aligned with text features to obtain semantic features. Additionally, we introduce the GFE module to complement the geometric information of point clouds and the MFF module to improve performance in few-shot settings. Experimental results demonstrate that our method outperforms existing state-of-the-art training-free approaches on mainstream benchmark datasets, including ModelNet and ScanObjectNN.
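The dual-branch idea can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the descriptor in `geometric_features`, the per-class similarity averaging in `geometric_logits`, the weighted sum in `fuse`, and the weight `alpha` are all hypothetical stand-ins, and the semantic branch assumes that point cloud and text embeddings from some text-aligned encoder are already available as arrays. The GFE and MFF modules are not modeled here.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def geometric_features(points, bins=8):
    """Hypothetical non-parametric descriptor (no trainable weights).

    points: (N, 3) array. Returns the 3 covariance eigenvalues plus a
    `bins`-bin histogram of point radii, after centering and scaling.
    """
    centered = points - points.mean(axis=0)
    centered = centered / (np.linalg.norm(centered, axis=1).max() + 1e-12)
    radii = np.linalg.norm(centered, axis=1)
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(centered.T)))[::-1]
    hist, _ = np.histogram(radii, bins=bins, range=(0.0, 1.0))
    return np.concatenate([eigvals, hist / len(points)])


def geometric_logits(query_feat, support_feats, support_labels, num_classes):
    """Mean cosine similarity between the query and each class's support features."""
    sims = l2_normalize(support_feats) @ l2_normalize(query_feat)  # (S,)
    one_hot = np.eye(num_classes)[support_labels]                  # (S, C)
    return sims @ one_hot / np.maximum(one_hot.sum(axis=0), 1)


def semantic_logits(point_embedding, text_embeddings):
    """Cosine similarity between a point cloud embedding and per-class text embeddings."""
    return l2_normalize(text_embeddings) @ l2_normalize(point_embedding)


def fuse(geo, sem, alpha=0.5):
    """Assumed fusion rule: convex combination of the two branches' scores."""
    return alpha * geo + (1.0 - alpha) * sem


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, shots = 2, 4
    # Placeholder data: random clouds and random embeddings, for shape-checking only.
    support = [geometric_features(rng.normal(size=(256, 3)))
               for _ in range(num_classes * shots)]
    labels = np.repeat(np.arange(num_classes), shots)
    query = geometric_features(rng.normal(size=(256, 3)))
    geo = geometric_logits(query, np.stack(support), labels, num_classes)
    sem = semantic_logits(rng.normal(size=16), rng.normal(size=(num_classes, 16)))
    print("predicted class:", int(np.argmax(fuse(geo, sem))))
```

Nothing in this sketch is trained: the geometric branch compares fixed descriptors against a small labeled support set, and the semantic branch only reuses embeddings from a pretrained encoder. With no support samples, the prediction would have to come from `semantic_logits` alone.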