🤖 AI Summary
This paper targets three bottlenecks in fine-grained 3D shape classification: weakly discriminative multi-view features, severe class imbalance, and poor model interpretability. It proposes what the authors describe as the first prototype-based, non-parametric learning framework for this task. The method combines three components: a prototype association mechanism, an online clustering strategy that refines the prototypes during training, and interpretable supervision driven by prototype–view correlations, which together enable transparent, case-level reasoning. The overall pipeline covers multi-view feature alignment, dynamic prototype evolution, fine-grained correlation modeling, and non-parametric prototype classification. Experiments on FG3D and ModelNet40 report higher accuracy than state-of-the-art methods, along with improved prediction confidence and visual interpretability.
📝 Abstract
Deep learning-based multi-view coarse-grained 3D shape classification has achieved remarkable success over the past decade, leveraging the powerful feature learning capabilities of CNN-based and ViT-based backbones. However, fine-grained 3D classification, a challenging research area critical for detailed shape understanding, remains understudied for three reasons: multi-view feature aggregation captures limited discriminative information, particularly for subtle inter-class variations; classes are imbalanced; and parametric models have inherent interpretability limitations. To address these problems, we propose Proto-FG3D, the first prototype-based framework for fine-grained 3D shape classification, which shifts the paradigm from parametric softmax to non-parametric prototype learning. First, Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association. Second, prototypes are refined via Online Clustering, improving both the robustness of multi-view feature allocation and inter-subclass balance. Finally, prototype-guided supervised learning enhances fine-grained discrimination through prototype-view correlation analysis and enables ad-hoc interpretability through transparent case-based reasoning. Experiments on FG3D and ModelNet40 show that Proto-FG3D surpasses state-of-the-art methods in accuracy, prediction transparency, and ad-hoc interpretability with visualizations, challenging conventional fine-grained 3D recognition approaches.
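To make the contrast with a parametric softmax head concrete, the following is a minimal sketch of generic nearest-prototype classification, not Proto-FG3D itself. It assumes each class holds a few prototype vectors and that a shape feature is assigned to the class of its most similar prototype under cosine similarity. The class names, prototype values, and the `classify` helper are hypothetical and chosen only for illustration; the abstract does not specify the similarity measure or the number of prototypes per class.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def classify(feature, prototypes):
    """Return (label, prototype_index, similarity) of the best-matching prototype.

    `prototypes` maps a class label to a list of prototype vectors.
    There is no learned classification layer: the decision is a lookup
    over prototypes, so the winning prototype can be inspected directly.
    """
    best = None
    for label, protos in prototypes.items():
        for idx, proto in enumerate(protos):
            sim = cosine(feature, proto)
            if best is None or sim > best[2]:
                best = (label, idx, sim)
    return best


# Hypothetical toy data: two sub-classes with two prototypes each.
prototypes = {
    "airliner": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "fighter": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.6]],
}

label, proto_idx, sim = classify([0.1, 0.7, 0.6], prototypes)
print(label, proto_idx, round(sim, 3))
```

Because the prediction is tied to one specific prototype, that prototype (and the training examples nearest to it) can serve as the "case" behind the decision, which is the general idea behind case-based reasoning in prototype models.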