From One to More: Contextual Part Latents for 3D Generation

📅 2025-07-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current 3D generative methods face three key bottlenecks: (1) monolithic latent representations struggle to capture multi-part geometric details; (2) global implicit encodings neglect part-level independence and structural relationships; and (3) conditional control lacks fine-grained editability. To address these, the authors propose CoPart, described as the first part-aware 3D diffusion generation framework. CoPart introduces a context-aware latent decomposition mechanism that splits an object into semantically consistent part-level latents whose relationships can be modeled explicitly. The method combines a 3D-native latent diffusion architecture, automated mesh segmentation, part-specific conditional encoding, and mutual-guidance fine-tuning. The authors also construct Partverse, a large-scale part-annotated 3D dataset. Experiments show that CoPart outperforms prior approaches in part-level editing, articulated object generation, and scene composition, with reported gains in geometric fidelity, structural coherence, and user controllability.

📝 Abstract

Recent advances in 3D generation have transitioned from multi-view 2D rendering approaches to 3D-native latent diffusion frameworks that exploit geometric priors in ground truth data. Despite progress, three key limitations persist: (1) single-latent representations fail to capture complex multi-part geometries, causing detail degradation; (2) holistic latent coding neglects the part independence and interrelationships that are critical for compositional design; and (3) global conditioning mechanisms lack fine-grained controllability. Inspired by human 3D design workflows, we propose CoPart, a part-aware diffusion framework that decomposes 3D objects into contextual part latents for coherent multi-part generation. This paradigm offers three advantages: (i) it reduces encoding complexity through part decomposition; (ii) it enables explicit modeling of part relationships; and (iii) it supports part-level conditioning. We further develop a mutual guidance strategy to fine-tune pre-trained diffusion models for joint part latent denoising, ensuring geometric coherence while preserving foundation model priors. To enable large-scale training, we construct Partverse, a novel 3D part dataset derived from Objaverse through automated mesh segmentation and human-verified annotations. Extensive experiments demonstrate CoPart's superior capabilities in part-level editing, articulated object generation, and scene composition with unprecedented controllability.

Problem

Research questions and friction points this paper is trying to address.

Single-latent representations degrade complex multi-part geometries

Holistic latent coding ignores part independence and interrelationships

Global conditioning lacks fine-grained controllability for 3D generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Part-aware diffusion framework for 3D generation

Decomposes objects into contextual part latents

Mutual guidance strategy for joint denoising

🔎 Similar Papers

GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians