🤖 AI Summary
Existing 3D generative models suffer from insufficient geometric detail under single-reference input, low training efficiency, high computational resource consumption, and poor adaptability to diverse surface types. To address these limitations, we propose a multi-scale sparse point-voxel diffusion architecture. Our method integrates sparse voxel grid representation, joint sampling of point cloud normals and colors, multi-scale neural feature extraction, and parallelized diffusion modeling. It is the first approach enabling efficient generation of fine-grained geometric variants and real-time human–computer collaborative editing. Experiments demonstrate that our method significantly improves surface detail fidelity, supports generalized modeling of irregular surfaces, accelerates training by 2.3×, reduces GPU memory consumption by 41%, and achieves millisecond-level latency in interactive design tasks.
📝 Abstract
This paper proposes ShapeShifter, a new 3D generative model that learns to synthesize shape variations based on a single reference model. While generative methods for 3D objects have recently attracted much attention, current techniques often lack geometric details and/or require long training times and large resources. Our approach remedies these issues by combining sparse voxel grids and point, normal, and color sampling within a multiscale neural architecture that can be trained efficiently and in parallel. We show that our resulting variations better capture the fine details of their original input and can handle more general types of surfaces than previous SDF-based methods. Moreover, we offer interactive generation of 3D shape variants, allowing more human control in the design loop if needed.