🤖 AI Summary
This work addresses the challenge of jointly preserving global structure and local geometric detail in high-fidelity 3D shape generation. We propose a two-stage diffusion framework: the first stage generates a coarse, voxelized global structure, and the second stage refines geometric detail using spatially localized voxel queries whose positions are encoded with Rotary Position Embedding (RoPE) for precise spatial anchoring. We also introduce watertightness-preserving preprocessing and a refinement mechanism that decouples spatial localization from geometric detail synthesis, which helps maintain topological integrity and surface fidelity under limited training resources. The method integrates 3D diffusion modeling, voxel-based representation, RoPE-enabled spatial localization, watertight mesh repair, and multi-level data augmentation. Trained solely on public 3D datasets, the approach achieves strong geometric quality and performs competitively with existing open-source methods; the code and trained models are slated for release.
📝 Abstract
In this report, we introduce UltraShape 1.0, a scalable 3D diffusion framework for high-fidelity 3D geometry generation. The proposed approach adopts a two-stage generation pipeline: a coarse global structure is first synthesized and then refined to produce detailed, high-quality geometry. To support reliable 3D generation, we develop a comprehensive data processing pipeline that includes a novel watertight processing method and high-quality data filtering. This pipeline improves the geometric quality of publicly available 3D datasets by removing low-quality samples, filling holes, and thickening thin structures, while preserving fine-grained geometric details. To enable fine-grained geometry refinement, we decouple spatial localization from geometric detail synthesis in the diffusion process. We achieve this by performing voxel-based refinement at fixed spatial locations, where voxel queries derived from coarse geometry provide explicit positional anchors encoded via RoPE, allowing the diffusion model to focus on synthesizing local geometric details within a reduced, structured solution space. Our model is trained exclusively on publicly available 3D datasets, achieving strong geometric quality despite limited training resources. Extensive evaluations demonstrate that UltraShape 1.0 performs competitively with existing open-source methods in both data processing quality and geometry generation. All code and trained models will be released to support future research.
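The abstract describes voxel queries taken from the coarse geometry and anchored in space via RoPE, but gives no implementation details. The NumPy sketch below is one plausible reading, not the authors' code: the function names (`voxel_queries`, `apply_rope_3d`), the equal split of feature channels across the x, y, and z axes, and the frequency base of 10000 are all assumptions made for illustration. It collects the indices of occupied voxels in a coarse occupancy grid and rotates each query's feature pairs by angles proportional to its voxel coordinates.

```python
import numpy as np


def voxel_queries(occupancy: np.ndarray) -> np.ndarray:
    """Return integer (x, y, z) indices of occupied voxels, shape (N, 3).

    Hypothetical helper: treats the coarse stage's output as a boolean grid.
    """
    return np.argwhere(occupancy)


def apply_rope_3d(features: np.ndarray, positions: np.ndarray,
                  base: float = 10000.0) -> np.ndarray:
    """Rotate feature pairs by angles derived from 3D voxel coordinates.

    features:  (N, D) array with D divisible by 6, so that each of the three
               axes receives an even number of channels (an assumed layout).
    positions: (N, 3) voxel coordinates.
    """
    n, d = features.shape
    if d % 6 != 0:
        raise ValueError("feature dimension must be divisible by 6")
    positions = np.asarray(positions, dtype=np.float64)

    half = d // 6  # number of rotated pairs per axis
    freqs = base ** (-np.arange(half) / half)  # (half,)

    # Angles per axis and frequency, flattened to (N, D / 2):
    # the first `half` pairs follow x, the next follow y, the last follow z.
    angles = (positions[:, :, None] * freqs[None, None, :]).reshape(n, -1)
    cos, sin = np.cos(angles), np.sin(angles)

    even, odd = features[:, 0::2], features[:, 1::2]
    out = np.empty_like(features, dtype=np.float64)
    out[:, 0::2] = even * cos - odd * sin
    out[:, 1::2] = even * sin + odd * cos
    return out


if __name__ == "__main__":
    # Toy coarse grid with two occupied voxels.
    grid = np.zeros((4, 4, 4), dtype=bool)
    grid[0, 0, 0] = True
    grid[1, 2, 3] = True

    coords = voxel_queries(grid)
    tokens = np.random.default_rng(0).normal(size=(len(coords), 12))
    anchored = apply_rope_3d(tokens, coords)
    print(coords.shape, anchored.shape)
```

Because each feature pair is rotated rather than shifted, token norms are unchanged, and the dot product between two rotated tokens depends only on the offset between their voxel coordinates. That relative-position property is the usual motivation for RoPE in attention layers; whether UltraShape 1.0 splits channels across axes in this way is not stated in the abstract.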