🤖 AI Summary
Existing point cloud analysis methods struggle to effectively model discriminative features within complex neighborhoods, often leading to significant information loss in deeper layers. To address this, this work proposes PointCRA, a novel network that introduces channel-wise relational modeling and neighborhood homogeneity constraints for the first time, establishing a multi-level calibration framework. A dedicated discriminative loss function is designed to enhance inter-channel distinctiveness, while temporal trend variation is incorporated as a new evaluation dimension to mitigate weight collapse in spatial-channel attention mechanisms. With minimal parameter overhead, PointCRA achieves high interpretability and strong transferability, delivering state-of-the-art performance on S3DIS (77.5% mIoU), ScanObjectNN (90.4% overall accuracy), and ShapeNetPart (87.4% instance mIoU).
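For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how overall accuracy (OA) and class-averaged mIoU are commonly computed from a confusion matrix. It is not taken from the PointCRA code: the function names and the toy labels are hypothetical, chosen only for illustration. It also does not cover instance mIoU as used on ShapeNetPart, which is typically averaged per shape rather than per class.

```python
# Illustrative only: helper names and toy labels are assumptions, not PointCRA code.

def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts points with ground-truth class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """Fraction of all points whose predicted class matches the ground truth."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                       # truth c, predicted otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example with three classes and six points.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
miou = mean_iou(cm)
```

In this toy case four of six points are correct, while the per-class IoUs are 1/3, 2/3, and 1/2, so mIoU penalizes confusion between classes more heavily than OA does.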
📝 Abstract
In 3D point cloud understanding, the core challenge is to accurately capture discriminative features within complex neighborhoods, which directly affects the precision of downstream tasks such as embodied AI and autonomous driving. Existing methods explore feature correlation discrimination but are limited to point-level spatial distribution or channel responses, enabling only coarse-grained evaluation. For modern multi-scale point cloud networks, such coarse-grained metrics inevitably incur significant information loss in deeper layers. To address this issue, we propose PointCRA, a novel network equipped with a channel-level metric-based enhancement mechanism. Our core idea is to introduce temporal trend variation as a new evaluation dimension to avoid the information loss caused by weight dimension collapse in existing spatial and channel attention mechanisms. On this basis, we construct a multi-level calibration framework guided by neighborhood homogeneity for weight calibration, and design a dedicated loss function to enhance channel discriminability. The module leverages the intrinsic feature priors of deep networks to adaptively correct the feature aggregation process, offering strong interpretability and transferability with low parameter overhead. We validate the method's effectiveness on diverse datasets and benchmark models, and further demonstrate its rationality through extensive analytical experiments. PointCRA achieves 77.5% mIoU on the S3DIS dataset, 90.4% OA on the ScanObjectNN dataset, and 87.4% instance mIoU on the ShapeNetPart dataset. The code and pretrained weights are publicly available on GitHub: