ZeroPS: High-quality Cross-modal Knowledge Transfer for Zero-Shot 3D Part Segmentation

📅 2023-11-24

🏛️ arXiv.org

📈 Citations: 5

✨ Influential: 2

🤖 AI Summary

This work addresses zero-shot 3D part segmentation of object point clouds without additional training or fine-tuning of the foundation models. It proposes ZeroPS, a pipeline that transfers knowledge from two pretrained 2D foundation models, SAM and GLIP, to 3D by exploiting the relationship between multi-view correspondence and the models' prompt mechanisms. Co-viewed regions combined with SAM's prompts lift 2D segmentation to 3D, while 2D-3D view projection combined with GLIP's prompts relates class names to 3D parts. Multi-view observations are then used to improve the final predictions. On the PartNetE and AKBSeg benchmarks, ZeroPS significantly outperforms the state-of-the-art method on zero-shot unlabeled and instance segmentation tasks. It applies to both simulated and real-world data and is reported to be hardly affected by domain shift.

📝 Abstract

Zero-shot 3D part segmentation is a challenging and fundamental task. In this work, we propose a novel pipeline, ZeroPS, which achieves high-quality knowledge transfer from 2D pretrained foundation models (FMs), SAM and GLIP, to 3D object point clouds. We aim to explore the natural relationship between multi-view correspondence and the FMs' prompt mechanism and build bridges on it. In ZeroPS, the relationship manifests as follows: 1) lifting 2D to 3D by leveraging co-viewed regions and SAM's prompt mechanism, 2) relating 1D classes to 3D parts by leveraging 2D-3D view projection and GLIP's prompt mechanism, and 3) enhancing prediction performance by leveraging multi-view observations. Extensive evaluations on the PartNetE and AKBSeg benchmarks demonstrate that ZeroPS significantly outperforms the SOTA method across zero-shot unlabeled and instance segmentation tasks. ZeroPS does not require additional training or fine-tuning for the FMs. ZeroPS applies to both simulated and real-world data. It is hardly affected by domain shift. The project page is available at https://luis2088.github.io/ZeroPS_page.
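The general idea of using 2D-3D view projection and multi-view observations can be illustrated with a small sketch. This is not the ZeroPS pipeline: it assumes that per-view 2D label masks are already available (in practice they would come from a 2D model), uses a plain pinhole camera, a crude per-pixel depth test for visibility, and simple majority voting across views. All function names, the `view` dictionary layout, and the `unlabeled` value are hypothetical choices made for illustration.

```python
from collections import Counter


def to_pixel(point, view):
    """Project a 3D point into a view with a pinhole camera (illustrative only).

    `view` is a dict with a 3x3 `rotation`, a 3-vector `translation`,
    a `focal` length in pixels, and the image `width` / `height`.
    Returns (u, v, depth), or None if the point is behind the camera
    or falls outside the image.
    """
    R, t = view["rotation"], view["translation"]
    # Camera-frame coordinates: x_cam = R @ p + t
    xc, yc, zc = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if zc <= 0:
        return None
    u = round(view["focal"] * xc / zc + view["width"] / 2)
    v = round(view["focal"] * yc / zc + view["height"] / 2)
    if 0 <= u < view["width"] and 0 <= v < view["height"]:
        return u, v, zc
    return None


def visible_pixels(points, view):
    """Map each visible point index to its pixel, keeping the nearest point per pixel."""
    nearest = {}  # (u, v) -> (depth, point index)
    for idx, p in enumerate(points):
        hit = to_pixel(p, view)
        if hit is None:
            continue
        u, v, depth = hit
        if (u, v) not in nearest or depth < nearest[(u, v)][0]:
            nearest[(u, v)] = (depth, idx)
    return {idx: uv for uv, (_, idx) in nearest.items()}


def vote_labels(points, views, masks, unlabeled=-1):
    """Give each 3D point the label it receives most often across all views."""
    votes = [Counter() for _ in points]
    for view, mask in zip(views, masks):
        for idx, (u, v) in visible_pixels(points, view).items():
            label = mask[v][u]  # masks are indexed as mask[row][column]
            if label != unlabeled:
                votes[idx][label] += 1
    return [c.most_common(1)[0][0] if c else unlabeled for c in votes]


# Toy scene: two parts split along x, plus one point that no view can see.
points = [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0), (0.5, 0.0, 1.0), (10.0, 0.0, 0.0)]

front = {"rotation": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
         "translation": [0, 0, 2], "focal": 4, "width": 8, "height": 8}
# Back view: rotated 180 degrees about the y-axis.
back = {"rotation": [[-1, 0, 0], [0, 1, 0], [0, 0, -1]],
        "translation": [0, 0, 3], "focal": 4, "width": 8, "height": 8}

# Hand-made masks standing in for 2D model output (left/right are mirrored in the back view).
mask_front = [[0 if u < 4 else 1 for u in range(8)] for _ in range(8)]
mask_back = [[1 if u < 4 else 0 for u in range(8)] for _ in range(8)]

labels = vote_labels(points, [front, back], [mask_front, mask_back])
print(labels)
```

In this toy scene the third point is hidden behind the second in the front view and only receives a label from the back view, which is the kind of gap that combining several viewpoints is meant to close. The last point projects outside both images and stays unlabeled.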

Problem

Research questions and friction points this paper is trying to address.

Zero-shot 3D part segmentation

Cross-modal knowledge transfer

2D to 3D lifting using SAM and GLIP

Innovation

Methods, ideas, or system contributions that make the work stand out.

2D to 3D knowledge transfer

Multi-view correspondence exploration

No additional training required
