🤖 AI Summary
Existing masked point cloud modeling (MPM) approaches predominantly rely on coordinate or feature regression, which tends to overfit local geometric details and undermines semantic generalization. To address this, we propose a clustering-driven MPM framework. First, we introduce a geometry-aware patching strategy that preserves local structural priors. Second, we design a teacher-student architecture integrating online k-means clustering with dynamic codebook updating, explicitly aligning masked-region feature distributions to cluster centroids and thereby decoupling detail reconstruction from semantic abstraction. By replacing regression-based constraints with clustering-based distribution alignment, our method encourages learning of more robust and generalizable point cloud representations. Extensive experiments demonstrate consistent and significant improvements across downstream tasks, including classification, part segmentation, and 3D object detection. The implementation is publicly available.
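To make the contrast with regression concrete, the following is a minimal NumPy sketch of a clustering-based alignment loss. It is an illustration under stated assumptions, not the released implementation: the cosine-similarity scoring, the temperature values, and the function names `assignment_probs` and `alignment_loss` are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def assignment_probs(features, centroids, temperature):
    """Soft assignment of each feature to each centroid.

    Assumption: cosine similarity scaled by a temperature; the source
    does not specify the similarity measure.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return softmax(f @ c.T / temperature, axis=1)


def alignment_loss(student_feats, teacher_feats, centroids,
                   t_student=0.1, t_teacher=0.05):
    """Cross-entropy between teacher and student cluster assignments.

    The student is penalised for predicting the wrong cluster, not for
    failing to reproduce the teacher feature exactly.
    Temperatures are hypothetical defaults.
    """
    target = assignment_probs(teacher_feats, centroids, t_teacher)
    pred = assignment_probs(student_feats, centroids, t_student)
    return float(-(target * np.log(pred + 1e-12)).sum(axis=1).mean())
```

Under this formulation, two student features that differ in fine detail but fall in the same cluster incur a similar loss, which is the intuition behind replacing a regression target with a distribution-alignment target.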
📝 Abstract
Most masked point cloud modeling (MPM) methods follow a regression paradigm, reconstructing the coordinates or features of masked regions. However, they tend to over-constrain the model to learn the details of the masked region, so it fails to capture generalized features. To address this limitation, we propose PointGAC, a novel clustering-based MPM method that aims to align the feature distribution of masked regions. Specifically, it features an online codebook-guided teacher-student framework. First, it presents a geometry-aware partitioning strategy to extract initial patches. Then, the teacher model updates a codebook via online k-means based on features extracted from the complete patches. This procedure drives the codebook vectors to become cluster centers. Afterward, we assign the unmasked features to their corresponding cluster centers, and the student model aligns the assignment for the reconstructed masked features. This strategy focuses on identifying the cluster centers to which the masked features belong, enabling the model to learn more generalized feature representations. Benefiting from a proposed codebook maintenance mechanism, codebook vectors are actively updated, which further increases the efficiency of semantic feature learning. Experiments validate the effectiveness of the proposed method on various downstream tasks. Code is available at https://github.com/LAB123-tech/PointGAC
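The codebook update described above can be sketched as a single online k-means step. This is a simplified illustration, not the authors' code: the Euclidean nearest-centroid rule, the exponential-moving-average update, the `momentum` value, and the function names are assumptions, and the codebook maintenance mechanism mentioned in the abstract is not modeled here.

```python
import numpy as np


def nearest_code(features, codebook):
    """Index of the closest codebook vector for each feature.

    Assumption: squared Euclidean distance is the assignment metric.
    """
    diff = features[:, None, :] - codebook[None, :, :]   # (N, K, D)
    return (diff ** 2).sum(axis=-1).argmin(axis=1)        # (N,)


def online_kmeans_step(codebook, features, momentum=0.9):
    """One online k-means update of the codebook from a batch.

    Each codebook vector moves toward the mean of the teacher features
    assigned to it; vectors with no assigned features stay unchanged.
    `momentum` is a hypothetical hyperparameter.
    """
    assign = nearest_code(features, codebook)
    updated = codebook.copy()
    for k in range(codebook.shape[0]):
        members = features[assign == k]
        if len(members) > 0:
            updated[k] = (momentum * codebook[k]
                          + (1.0 - momentum) * members.mean(axis=0))
    return updated, assign
```

Repeating this step over batches of teacher features pulls each codebook vector toward the center of the features assigned to it. The returned assignments could then serve as targets for the student's predictions on masked patches.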