🤖 AI Summary
Cross-dataset inconsistency in part definitions and lack of semantic naming hinder generalization in 3D part segmentation.
Method: We propose the first end-to-end framework for 3D part segmentation with semantic naming, introducing *partlets* (learnable implicit part representations) and a geometric-visual-linguistic multimodal alignment mechanism. Our approach jointly optimizes implicit part fields, multi-view features, and LLM-generated functional descriptions, achieving fine-grained alignment via bipartite matching and a joint text-geometry embedding.
Contributions/Results: (1) The first unified part ontology covering PartNet, 3DCoMPaT++, and Find3D (1,794 categories); (2) Zero-shot semantic naming with calibrated confidence scoring; (3) Two dedicated evaluation metrics and the new Tex-Parts benchmark. Our method achieves state-of-the-art performance on PartNet and other major benchmarks.
📝 Abstract
We address semantic 3D part segmentation: decomposing objects into parts with meaningful names. Although datasets with part annotations exist, their part definitions are inconsistent across datasets, which limits robust training. Previous methods either produce unlabeled decompositions or retrieve single parts without annotating the complete shape. We propose ALIGN-Parts, which formulates part naming as a direct set alignment task. Our method decomposes shapes into partlets (implicit 3D part representations) that are matched to part descriptions via bipartite assignment. We combine geometric cues from 3D part fields, appearance from multi-view vision features, and semantic knowledge from language-model-generated affordance descriptions. A text-alignment loss ensures that partlets share an embedding space with text, enabling a matching setup that is, in theory, open-vocabulary given sufficient data. Our efficient and novel one-shot 3D part segmentation and naming method has applications in several downstream tasks, including serving as a scalable annotation engine. Because our model supports zero-shot matching to arbitrary descriptions and confidence-calibrated predictions for known categories, we use it, together with human verification, to create a unified ontology that aligns PartNet, 3DCoMPaT++, and Find3D and comprises 1,794 unique 3D parts. We also show examples from our newly created Tex-Parts dataset and introduce two novel metrics suited to the named 3D part segmentation task.
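To make the bipartite assignment step concrete, here is a minimal sketch of matching partlet embeddings to part-description embeddings. It is an illustration under assumptions, not the ALIGN-Parts implementation: the abstract does not state the matching cost, so cosine similarity is assumed; the names `cosine` and `match_partlets` and the toy 3-D embeddings are hypothetical; and the search is an exact brute force that is only practical for tiny sets (a Hungarian-algorithm solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice at scale).

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_partlets(partlet_embs, text_embs):
    """One-to-one assignment of partlets to descriptions.

    Returns a sorted list of (partlet_index, text_index) pairs that
    maximizes the total cosine similarity. Partlets left over when
    there are more partlets than descriptions stay unmatched.
    """
    n_partlets, n_texts = len(partlet_embs), len(text_embs)
    if n_texts > n_partlets:
        raise ValueError("need at least as many partlets as descriptions")

    # Similarity matrix: rows are partlets, columns are descriptions.
    sim = [[cosine(p, t) for t in text_embs] for p in partlet_embs]

    best_score, best_perm = -math.inf, ()
    # perm[j] is the partlet assigned to description j (exact, brute force).
    for perm in permutations(range(n_partlets), n_texts):
        score = sum(sim[perm[j]][j] for j in range(n_texts))
        if score > best_score:
            best_score, best_perm = score, perm
    return sorted((p, j) for j, p in enumerate(best_perm))


# Toy data (assumed): two part descriptions and three partlets.
part_names = ["seat", "leg"]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
partlet_embs = [[0.9, 0.1, 0.0], [0.1, 0.95, 0.0], [0.0, 0.1, 0.9]]

pairs = match_partlets(partlet_embs, text_embs)
named = {p: part_names[j] for p, j in pairs}
print(named)  # {0: 'seat', 1: 'leg'}
```

In this toy case the third partlet has no matching description and is left unnamed, which mirrors how a fixed pool of partlets can serve shapes with varying numbers of parts.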