GeoSAM2: Unleashing the Power of SAM2 for 3D Part Segmentation

📅 2025-08-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing 3D generation methods suffer from insufficient geometric detail under sparse or single-view inputs and lack efficient zero-shot 3D part segmentation frameworks. To address this, we introduce the first adaptation of the vision foundation model SAM2 to 3D part segmentation. Our approach proposes a spatially aware prompting mechanism and a geometry-guided prompt propagation–mask fusion framework. It integrates multi-view point cloud projection, view-aligned prompting, confidence-weighted mask fusion, and geometric consistency optimization—enabling end-to-end zero-shot segmentation without any 3D annotation. Evaluated on ShapeNetPart and PartNet benchmarks, our method significantly outperforms existing open-source models, particularly preserving higher geometric fidelity at complex part boundaries. This work establishes a scalable, part-level parsing paradigm for 3D understanding and editing.

Technology Category

Application Category

📝 Abstract
Modern 3D generation methods can rapidly create shapes from sparse or single views, but their outputs often lack geometric detail due to computational constraints. We present DetailGen3D, a generative approach specifically designed to enhance these generated 3D shapes. Our key insight is to model the coarse-to-fine transformation directly through data-dependent flows in latent space, avoiding the computational overhead of large-scale 3D generative models. We introduce a token matching strategy that ensures accurate spatial correspondence during refinement, enabling local detail synthesis while preserving global structure. By carefully designing our training data to match the characteristics of synthesized coarse shapes, our method can effectively enhance shapes produced by various 3D generation and reconstruction approaches, from single-view to sparse multi-view inputs. Extensive experiments demonstrate that DetailGen3D achieves high-fidelity geometric detail synthesis while maintaining efficiency in training.
Problem

Research questions and friction points this paper is trying to address.

Enhancing geometric detail in generated 3D shapes
Modeling coarse-to-fine transformation in latent space
Synthesizing local details while preserving global structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data-dependent latent space flows
Token matching for spatial correspondence
Training data matching coarse shapes