🤖 AI Summary
Existing image-guided part-level 3D generation methods either rely on implicit part segmentation with limited control over granularity or depend on strong external segmenters trained on large annotated datasets, which hinders decomposable, structured synthesis. To address this, we propose Geom-Seg VecSet, a unified latent representation that jointly encodes object geometry and part-level structure, and build on it UniPart, a two-stage latent diffusion framework for image-guided part-level 3D generation. The first stage jointly generates whole-object geometry and a latent part segmentation; the second stage conditions part-level diffusion on both whole-object and part-specific latents, predicting part latents in both global and canonical spaces to improve geometric fidelity. Experiments show that the method achieves better segmentation controllability and part-level geometric quality than existing approaches.
📝 Abstract
Part-level 3D generation is essential for applications requiring decomposable and structured 3D synthesis. However, existing methods either rely on implicit part segmentation with limited granularity control or depend on strong external segmenters trained on large annotated datasets. In this work, we observe that part awareness emerges naturally during whole-object geometry learning and propose Geom-Seg VecSet, a unified geometry-segmentation latent representation that jointly encodes object geometry and part-level structure. Building on this representation, we introduce UniPart, a two-stage latent diffusion framework for image-guided part-level 3D generation. The first stage performs joint geometry generation and latent part segmentation, while the second stage conditions part-level diffusion on both whole-object and part-specific latents. A dual-space generation scheme further enhances geometric fidelity by predicting part latents in both global and canonical spaces. Extensive experiments demonstrate that UniPart achieves superior segmentation controllability and part-level geometric quality compared with existing approaches.
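To make the distinction between global and canonical spaces concrete, here is a minimal sketch of one common way to move a part between the two. The abstract does not define the canonical space, so this assumes an axis-aligned bounding-box normalization into the box [-0.5, 0.5]^3; the function names `to_canonical` and `to_global` are hypothetical and are not taken from UniPart.

```python
import numpy as np

def to_canonical(points):
    """Map a part's points from global (object) space into the box [-0.5, 0.5]^3.

    Assumption: the canonical space is a bounding-box normalization.
    Returns the normalized points plus the (center, scale) needed to undo it.
    """
    points = np.asarray(points, dtype=np.float64)
    lo, hi = points.min(axis=0), points.max(axis=0)
    center = (lo + hi) / 2.0
    # A uniform scale by the longest bounding-box edge keeps the aspect ratio.
    scale = float(np.max(hi - lo))
    if scale == 0.0:  # degenerate part (a single point): avoid dividing by zero
        scale = 1.0
    return (points - center) / scale, center, scale

def to_global(canonical_points, center, scale):
    """Inverse of to_canonical: place a canonical part back in the object frame."""
    return np.asarray(canonical_points, dtype=np.float64) * scale + center

# A small part sitting off-center inside the whole object.
part = np.array([[0.60, 0.10, 0.20],
                 [0.80, 0.20, 0.20],
                 [0.70, 0.15, 0.30]])
canon, center, scale = to_canonical(part)
restored = to_global(canon, center, scale)
```

Under this assumption, the canonical view gives every part the same normalized extent, so small parts are not under-resolved, while the global view retains where the part sits within the object. That is one plausible reading of why predicting in both spaces could help geometric fidelity.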