Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Two-stage VAE-plus-diffusion pipelines for high-resolution 3D shape modeling lose detail because of inefficient representations and modality mismatches introduced in the VAE. The underlying difficulty is that meshes are unstructured and dense voxel grids scale cubically with resolution. The paper introduces Sparc3D, a unified sparse framework with two components: (1) Sparcubes, a sparse deformable marching cubes representation that converts raw meshes into surfaces with arbitrary topology at 1024³ resolution by scattering signed distance and deformation fields onto a sparse cube, which allows differentiable optimization; and (2) Sparconv-VAE, a modality-consistent variational autoencoder built entirely on sparse convolutional networks for near-lossless latent compression. The resulting latents feed a latent diffusion model for generation. The authors report state-of-the-art reconstruction fidelity on open surfaces, disconnected components, and intricate geometry, along with reduced training and inference cost.

📝 Abstract
High-fidelity 3D object synthesis remains significantly more challenging than 2D image generation due to the unstructured nature of mesh data and the cubic complexity of dense volumetric grids. Existing two-stage pipelines, which compress meshes with a VAE (using either 2D or 3D supervision) and then sample with latent diffusion, often suffer from severe detail loss caused by inefficient representations and modality mismatches introduced in the VAE. We introduce Sparc3D, a unified framework that combines a sparse deformable marching cubes representation, Sparcubes, with a novel encoder, Sparconv-VAE. Sparcubes converts raw meshes into high-resolution ($1024^3$) surfaces with arbitrary topology by scattering signed distance and deformation fields onto a sparse cube, allowing differentiable optimization. Sparconv-VAE is the first modality-consistent variational autoencoder built entirely upon sparse convolutional networks, enabling efficient and near-lossless 3D reconstruction suitable for high-resolution generative modeling through latent diffusion. Sparc3D achieves state-of-the-art reconstruction fidelity on challenging inputs, including open surfaces, disconnected components, and intricate geometry. It preserves fine-grained shape details, reduces training and inference cost, and integrates naturally with latent diffusion models for scalable, high-resolution 3D generation.
Problem

Research questions and friction points this paper is trying to address.

High-fidelity 3D object synthesis is challenging because mesh data is unstructured and dense volumetric grids have cubic complexity.
Existing two-stage VAE-plus-diffusion pipelines suffer from detail loss due to inefficient representations and modality mismatches introduced in the VAE.
Sparc3D introduces a unified framework for efficient, high-resolution 3D reconstruction and generative modeling.
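
To make the "cubic complexity" point concrete, the sketch below compares the number of cells in a dense grid with the number of voxels that lie in a thin band around a surface. It is an illustration only, not code from the paper: the function names, the sphere test shape, the radius of 0.4, and the one-voxel band width are all assumptions chosen for the example.

```python
import numpy as np


def dense_voxel_count(res: int) -> int:
    """Number of cells in a dense res x res x res grid."""
    return res ** 3


def dense_bytes(res: int, bytes_per_value: int = 4) -> int:
    """Memory for one scalar channel stored densely (float32 by default)."""
    return dense_voxel_count(res) * bytes_per_value


def narrow_band_count(res: int, radius: float = 0.4, band_voxels: float = 1.0) -> int:
    """Count voxel centers within `band_voxels` voxel widths of a sphere surface.

    The sphere is a stand-in test shape (an assumption of this sketch),
    centered in a unit cube that spans [-0.5, 0.5] on each axis.
    """
    h = 1.0 / res                               # voxel edge length
    centers = (np.arange(res) + 0.5) * h - 0.5  # voxel centers along one axis
    x, y, z = np.meshgrid(centers, centers, centers, indexing="ij", sparse=True)
    sdf = np.sqrt(x ** 2 + y ** 2 + z ** 2) - radius  # signed distance to the sphere
    return int(np.count_nonzero(np.abs(sdf) <= band_voxels * h))


if __name__ == "__main__":
    for res in (32, 64, 128):
        band = narrow_band_count(res)
        dense = dense_voxel_count(res)
        print(f"res={res}: dense={dense}, near-surface={band}, fraction={band / dense:.4f}")
    print(f"dense float32 grid at 1024^3: {dense_bytes(1024) / 2 ** 30:.0f} GiB per channel")
```

Doubling the resolution multiplies the dense cell count by eight, while the near-surface count grows by roughly four, since it tracks surface area rather than volume. A dense float32 grid at 1024³ already needs 4 GiB for a single scalar channel, which is the cost that storing only near-surface voxels avoids.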
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse deformable marching cubes for high-resolution surfaces
Modality-consistent VAE with sparse convolutional networks
Efficient near-lossless 3D reconstruction for diffusion models
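
As a rough picture of what "scattering signed distance and deformation fields onto a sparse cube" could look like as a data structure, the sketch below keeps only grid vertices near a surface and stores a signed distance and a per-vertex offset for each. This is a hypothetical layout, not the authors' implementation: `SparseSDFGrid`, `scatter_sphere`, the sphere test shape, the 1.5-voxel band, and the half-voxel clamp on offsets are all assumptions made for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SparseSDFGrid:
    """Hypothetical sparse container: values exist only at active vertices."""
    res: int
    coords: np.ndarray  # (N, 3) integer vertex indices
    sdf: np.ndarray     # (N,) signed distance at each active vertex
    deform: np.ndarray  # (N, 3) per-vertex offset, in units of one voxel

    def positions(self) -> np.ndarray:
        """World-space vertex positions after applying the offsets.

        Offsets are clamped to half a voxel (an assumption of this sketch)
        so that neighbouring vertices cannot cross each other.
        """
        h = 1.0 / self.res
        offset = np.clip(self.deform, -0.5, 0.5) * h
        return self.coords * h - 0.5 + offset


def scatter_sphere(res: int, radius: float = 0.4, band_voxels: float = 1.5) -> SparseSDFGrid:
    """Keep only the grid vertices within a narrow band of a sphere surface."""
    h = 1.0 / res
    idx = np.indices((res + 1,) * 3).reshape(3, -1).T  # all vertex indices, shape (M, 3)
    pts = idx * h - 0.5                                # unit cube centered at the origin
    sdf = np.linalg.norm(pts, axis=1) - radius
    keep = np.abs(sdf) <= band_voxels * h
    n_active = int(keep.sum())
    return SparseSDFGrid(
        res=res,
        coords=idx[keep],
        sdf=sdf[keep].astype(np.float32),
        deform=np.zeros((n_active, 3), dtype=np.float32),  # start undeformed
    )


if __name__ == "__main__":
    grid = scatter_sphere(32)
    total = (grid.res + 1) ** 3
    print(f"active vertices: {len(grid.coords)} of {total}")
```

In a differentiable setting, `sdf` and `deform` would be the optimized quantities, and a marching-cubes-style extraction would run only over the active cells. That extraction step is omitted here.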