Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D

📅 2024-08-24
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
📄 PDF
🤖 AI Summary
To address the limitations of existing mesh part segmentation methods, such as reliance on category-specific annotations and poor generalization to unseen categories, this paper introduces the first zero-shot 3D mesh part segmentation framework. Methodologically, it renders multi-view images of surface normals and Shape Diameter Function (SDF) scalars, applies the Segment Anything Model (SAM) to obtain per-view 2D masks, and achieves part-level 3D segmentation via geometric-consistency-based mask aggregation and 2D-to-3D lifting. Key contributions include: (i) the first adaptation of a 2D vision foundation model for zero-shot transfer to 3D mesh segmentation; (ii) zero-training generalization to novel categories; and (iii) plug-and-play compatibility with upgraded SAM variants. Quantitative evaluation on standard and custom benchmarks shows segmentation accuracy competitive with or superior to conventional SDF-based methods. Human evaluation further confirms significantly improved semantic consistency across parts and stronger cross-shape generalization.
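The mask-aggregation and lifting step summarized above can be sketched in miniature: each view contributes a face-id render (which mesh face each pixel shows) and a SAM mask image; masks from different views whose covered-face sets overlap strongly are treated as the same 3D part, and each face then votes for its part. The function names, the greedy IoU merge, and the toy 2×2 renders below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from collections import defaultdict

def face_sets(face_id_map, mask):
    """For each 2D mask label in one view, collect the set of mesh faces
    its pixels cover (face id -1 means background, label 0 unsegmented)."""
    sets = defaultdict(set)
    valid = (face_id_map >= 0) & (mask > 0)
    for f, m in zip(face_id_map[valid], mask[valid]):
        sets[int(m)].add(int(f))
    return sets

def lift(face_id_maps, view_masks, iou_thresh=0.5):
    """Greedy cross-view association: union-find over per-view masks,
    merging pairs whose face-set IoU clears a threshold, then a per-face
    majority vote. Returns {face_id: part_label}."""
    nodes = []
    for fid, mask in zip(face_id_maps, view_masks):
        nodes.extend(face_sets(fid, mask).values())
    parent = list(range(len(nodes)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            inter = len(nodes[i] & nodes[j])
            union = len(nodes[i] | nodes[j])
            if union and inter / union >= iou_thresh:
                parent[find(i)] = find(j)
    # Each face takes the merged part that covers it most often.
    votes = defaultdict(lambda: defaultdict(int))
    for i, faces in enumerate(nodes):
        root = find(i)
        for f in faces:
            votes[f][root] += 1
    return {f: max(counts, key=counts.get) for f, counts in votes.items()}
```

Note that mask labels need no consistency across views: two views labeling the same part 1 and 5 still merge, because association happens in face-set space rather than label space.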

📝 Abstract
We propose Segment Any Mesh, a novel zero-shot mesh part segmentation method that overcomes the limitations of shape analysis-based, learning-based, and contemporary approaches. Our approach operates in two phases: multimodal rendering and 2D-to-3D lifting. In the first phase, multiview renders of the mesh are individually processed through Segment Anything to generate 2D masks. These masks are then lifted into a mesh part segmentation by associating masks that refer to the same mesh part across the multiview renders. We find that applying Segment Anything to multimodal feature renders of normals and shape diameter scalars achieves better results than using only untextured renders of meshes. By building our method on top of Segment Anything, we seamlessly inherit any future improvements made to 2D segmentation. We compare our method with a robust, well-evaluated shape analysis method, Shape Diameter Function, and show that our method is comparable to or exceeds its performance. Since current benchmarks contain limited object diversity, we also curate and release a dataset of generated meshes and use it to demonstrate our method's improved generalization over Shape Diameter Function via human evaluation. We release the code and dataset at https://github.com/gtangg12/samesh.
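The shape diameter scalars the abstract renders alongside normals measure local thickness: from a point on the surface, cast a ray inward and record the distance to the opposite side. Below is a minimal single-ray version; the actual Shape Diameter Function averages a cone of inward rays, and the unit-cube triangle soup, the function names, and the single-ray simplification are assumptions for illustration only.

```python
import numpy as np

def ray_triangle(o, d, v0, v1, v2, eps=1e-9):
    """Moeller-Trumbore ray/triangle intersection; returns hit distance t or None."""
    e1, e2 = v1 - v0, v2 - v0
    h = np.cross(d, e2)
    a = np.dot(e1, h)
    if abs(a) < eps:                     # ray parallel to triangle plane
        return None
    f = 1.0 / a
    s = o - v0
    u = f * np.dot(s, h)
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = f * np.dot(d, q)
    if v < 0.0 or u + v > 1.0:
        return None
    t = f * np.dot(e2, q)
    return t if t > eps else None        # ignore hits at/behind the origin

def shape_diameter(point, inward_normal, triangles):
    """Distance from a surface point to the opposite side of the shape along
    one inward ray (the full SDF averages many rays inside a cone, and the
    result is rendered as a per-pixel scalar channel for SAM)."""
    hits = [t for tri in triangles
            if (t := ray_triangle(point, inward_normal, *tri)) is not None]
    return min(hits) if hits else None

# Axis-aligned unit cube as a triangle soup (two triangles per quad face).
V = np.array([[x, y, z] for x in (0., 1.) for y in (0., 1.) for z in (0., 1.)])
quads = [(0, 4, 6, 2), (1, 5, 7, 3),    # z = 0, z = 1
         (0, 1, 5, 4), (2, 3, 7, 6),    # y = 0, y = 1
         (0, 1, 3, 2), (4, 5, 7, 6)]    # x = 0, x = 1
triangles = []
for a, b, c, d in quads:
    triangles.append((V[a], V[b], V[c]))
    triangles.append((V[a], V[c], V[d]))

# From the bottom-face centroid, looking inward (+z), the diameter is 1.
sd = shape_diameter(np.array([0.5, 0.5, 0.0]), np.array([0.0, 0.0, 1.0]), triangles)
```

Thin regions (a chair leg) get small SDF values and thick regions (the seat) large ones, which is why rendering this scalar helps SAM separate parts that look identical in an untextured render.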
Problem

Research questions and friction points this paper is trying to address.

Existing mesh part segmentation methods rely on category-specific annotations and generalize poorly to unseen categories
Shape analysis baselines such as Shape Diameter Function lack semantic consistency across parts and shapes
Current benchmarks contain limited object diversity, making generalization difficult to evaluate
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot mesh part segmentation by lifting Segment Anything masks from multiview renders to 3D
Multimodal rendering of surface normals and shape diameter scalars, outperforming untextured renders
A curated, released dataset of generated meshes for evaluating cross-shape generalization
George Tang
MIT CSAIL, Backflip AI
William Zhao
MIT CSAIL
Logan Ford
Backflip AI
David Benhaim
Backflip AI
Paul Zhang
Backflip AI