🤖 AI Summary
Existing autoregressive methods for artist-style 3D mesh generation struggle to preserve global structural consistency and local geometric fidelity at the same time, and they suffer from error accumulation. To address this, we propose a part-aware discrete diffusion framework. The mesh is first segmented into semantically meaningful parts; inter-part dependencies are then modeled autoregressively, while the geometry inside each part is generated by parallel discrete diffusion. A part-aware cross-attention mechanism separates global topology from high-frequency geometric detail, with point clouds serving as fine-grained geometric conditioning. The whole pipeline is implemented as an end-to-end generative model based on the DiT architecture. Experiments show substantial improvements over state-of-the-art methods in 3D mesh generation, with higher detail fidelity and more robust structural coherence, pointing to practical use in artistic design applications.
📝 Abstract
Existing autoregressive (AR) methods for generating artist-designed meshes struggle to balance global structural consistency with high-fidelity local details, and are susceptible to error accumulation. To address this, we propose PartDiffuser, a novel semi-autoregressive diffusion framework for point-cloud-to-mesh generation. The method first performs semantic segmentation on the mesh and then operates part by part: it applies autoregression between parts to preserve global topology, and runs a parallel discrete diffusion process within each semantic part to reconstruct high-frequency geometric features precisely. PartDiffuser is based on the DiT architecture and introduces a part-aware cross-attention mechanism that uses point clouds as hierarchical geometric conditioning to control the generation process dynamically, thereby decoupling the global and local generation tasks. Experiments demonstrate that the method significantly outperforms state-of-the-art (SOTA) models at generating richly detailed 3D meshes, with a level of detail fidelity suitable for real-world applications.
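The semi-autoregressive schedule described above (autoregression between parts, parallel discrete diffusion within each part) can be illustrated with a toy sketch. This is not the PartDiffuser implementation. The names `generate_mesh`, `generate_part`, `toy_denoiser` and `MASK`, the absorbing-mask formulation of discrete diffusion, the confidence-ordered unmasking schedule, and the vocabulary size are all assumptions made for illustration. The random `toy_denoiser` stands in for the learned DiT, and point-cloud conditioning is omitted.

```python
import random

MASK = -1  # hypothetical id for the absorbing "masked" token


def toy_denoiser(tokens, context, rng, vocab_size):
    """Stand-in for the learned network (assumption).

    Returns a (proposed_token, confidence) pair for every position.
    A real model would condition on `tokens`, on `context` (the parts
    generated so far), and on point-cloud features; this toy ignores them.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def generate_part(length, context, rng, vocab_size=8, steps=4):
    """Fill one part by iterative parallel unmasking."""
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        proposals = toy_denoiser(tokens, context, rng, vocab_size)
        # Reveal an even share of the still-masked positions per step,
        # most confident first; the final step reveals whatever is left.
        remaining_steps = steps - step
        n_reveal = -(-len(masked) // remaining_steps)  # ceiling division
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:n_reveal]:
            tokens[i] = proposals[i][0]
    return tokens


def generate_mesh(part_lengths, seed=0):
    """Generate parts one after another; each part sees all earlier parts."""
    rng = random.Random(seed)
    context = []  # tokens of the parts completed so far
    parts = []
    for length in part_lengths:
        part = generate_part(length, context, rng)
        parts.append(part)
        context.extend(part)  # later parts are conditioned on this part
    return parts


parts = generate_mesh([5, 3, 7])
```

Under these assumptions, the outer loop in `generate_mesh` runs once per part, while every token inside a part is resolved in at most `steps` parallel passes rather than one pass per token. That is the property a semi-autoregressive scheme relies on to limit sequential error accumulation.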