Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders

📅 2025-03-13
🤖 AI Summary
To address the low reconstruction fidelity and loss of geometric detail in 3D variational autoencoders (VAEs), this paper proposes a hybrid representation that integrates triplanes and octrees. The method embeds an octree structure into the VAE encoder, described as the first such integration, to enable non-uniform, surface-aware encoding and explicit 3D topological modeling. It further introduces a hybrid latent space that combines a triplane with a low-resolution voxel grid, balancing global shape coherence against local geometric detail. Geometry-aware sampling and multi-scale triplane representations are incorporated to improve surface fidelity. Evaluated on ShapeNet and other benchmarks, the approach reports a 2.1 dB gain in PSNR and a 38% reduction in Chamfer distance. It supports high-fidelity reconstruction of complex topologies and fine-grained geometry, providing a robust latent foundation for high-quality diffusion-based 3D generation.

📝 Abstract
Recent 3D content generation pipelines often leverage Variational Autoencoders (VAEs) to encode shapes into compact latent representations, facilitating diffusion-based generation. Efficiently compressing 3D shapes while preserving intricate geometric details remains a key challenge. Existing 3D shape VAEs often employ uniform point sampling and 1D/2D latent representations, such as vector sets or triplanes, leading to significant geometric detail loss due to inadequate surface coverage and the absence of explicit 3D representations in the latent space. Although recent work explores 3D latent representations, their large scale hinders high-resolution encoding and efficient training. Given these challenges, we introduce Hyper3D, which enhances VAE reconstruction through efficient 3D representation that integrates hybrid triplane and octree features. First, we adopt an octree-based feature representation to embed mesh information into the network, mitigating the limitations of uniform point sampling in capturing geometric distributions along the mesh surface. Furthermore, we propose a hybrid latent space representation that integrates a high-resolution triplane with a low-resolution 3D grid. This design not only compensates for the lack of explicit 3D representations but also leverages a triplane to preserve high-resolution details. Experimental results demonstrate that Hyper3D outperforms traditional representations by reconstructing 3D shapes with higher fidelity and finer details, making it well-suited for 3D generation pipelines.
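
To make the scale argument in the abstract concrete, the sketch below counts the number of latent values in a triplane, a dense 3D grid, and a hybrid of the two. The function names, resolutions, and channel count are illustrative assumptions and are not taken from the paper.

```python
def triplane_size(res: int, channels: int) -> int:
    """Values stored in three axis-aligned feature planes of res x res."""
    return 3 * res * res * channels


def grid_size(res: int, channels: int) -> int:
    """Values stored in a dense res x res x res feature grid."""
    return res ** 3 * channels


def hybrid_size(plane_res: int, grid_res: int, channels: int) -> int:
    """High-resolution triplane plus a low-resolution 3D grid."""
    return triplane_size(plane_res, channels) + grid_size(grid_res, channels)


if __name__ == "__main__":
    C = 8  # hypothetical channel count
    print("triplane 64^2 x 3:", triplane_size(64, C))
    print("dense grid 64^3:  ", grid_size(64, C))
    print("hybrid 64^2 + 16^3:", hybrid_size(64, 16, C))
```

Under these assumed settings the dense grid grows cubically with resolution while the triplane grows quadratically, so adding a coarse 3D grid to a high-resolution triplane costs far less than a dense grid at the triplane's resolution.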

Problem

Research questions and friction points this paper is trying to address.

Efficient 3D shape compression preserving geometric details.
Overcoming limitations of uniform point sampling in VAEs.
Enhancing 3D reconstruction fidelity with hybrid triplane-octree features.
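
The points above concern how sampling is distributed relative to the surface. As a rough illustration of why an octree helps, the sketch below subdivides only the octants that contain surface samples, so cells concentrate near the shape instead of filling the whole volume. This is a generic occupancy octree, not the paper's feature encoder; `sphere_points` and `build_octree_leaves` are hypothetical helpers, and the sphere stands in for a mesh surface.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[Point, float]  # (minimum corner, edge length)


def sphere_points(n: int, radius: float = 0.4, center: float = 0.5) -> List[Point]:
    """Deterministic, roughly even samples on a sphere (Fibonacci lattice)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        theta = golden * i
        pts.append((center + radius * r * math.cos(theta),
                    center + radius * r * math.sin(theta),
                    center + radius * z))
    return pts


def build_octree_leaves(points: List[Point],
                        origin: Point = (0.0, 0.0, 0.0),
                        size: float = 1.0,
                        depth: int = 0,
                        max_depth: int = 4) -> List[Cell]:
    """Return occupied leaf cells at max_depth; empty octants are pruned."""
    if not points:
        return []
    if depth == max_depth:
        return [(origin, size)]
    half = size / 2.0
    buckets = {}
    for p in points:
        # Child octant index (0 or 1) along each axis, clamped for safety.
        key = tuple(min(max(int((p[a] - origin[a]) / half), 0), 1) for a in range(3))
        buckets.setdefault(key, []).append(p)
    leaves: List[Cell] = []
    for key, child_points in buckets.items():
        child_origin = tuple(origin[a] + key[a] * half for a in range(3))
        leaves.extend(build_octree_leaves(child_points, child_origin, half,
                                          depth + 1, max_depth))
    return leaves


if __name__ == "__main__":
    leaves = build_octree_leaves(sphere_points(2000), max_depth=4)
    print("occupied leaves:", len(leaves), "of", 16 ** 3, "cells in a dense grid")
```

Only cells touched by the surface survive, which is the property that lets an octree spend capacity on geometry rather than on empty space.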

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid triplane and octree feature integration
Octree-based feature for mesh surface detail
High-resolution triplane with low-resolution 3D grid
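
The hybrid latent listed above can be pictured as a feature lookup that reads from three 2D planes and one coarse 3D grid. The following is a minimal sketch under stated assumptions: features are stored as nested lists, plane features are summed, and the grid feature is concatenated. The paper's actual fusion and decoder are not specified here, and `interp` and `query_hybrid` are hypothetical names.

```python
from itertools import product
from typing import Dict, List, Sequence


def interp(grid, coords: Sequence[float]) -> List[float]:
    """Multilinear interpolation of a nested-list feature grid.

    grid has one nesting level per coordinate (each of length >= 2),
    followed by a feature vector; coords are in [0, 1].
    """
    corners = []
    node = grid
    for c in coords:
        n = len(node)
        x = min(max(c, 0.0), 1.0) * (n - 1)
        i0 = min(int(x), n - 2)
        t = x - i0
        corners.append(((i0, 1.0 - t), (i0 + 1, t)))
        node = node[0]  # descend one axis to read the next resolution
    out = [0.0] * len(node)  # node is now a feature vector
    for combo in product(*corners):
        weight = 1.0
        cell = grid
        for idx, w in combo:
            weight *= w
            cell = cell[idx]
        for k, v in enumerate(cell):
            out[k] += weight * v
    return out


def query_hybrid(planes: Dict[str, list], grid, p: Sequence[float]) -> List[float]:
    """Feature at point p: summed triplane features, then the grid feature."""
    x, y, z = p
    f_xy = interp(planes["xy"], (x, y))  # bilinear lookups on each plane
    f_xz = interp(planes["xz"], (x, z))
    f_yz = interp(planes["yz"], (y, z))
    plane_feat = [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]
    # Assumption: concatenate the coarse 3D grid feature (trilinear lookup).
    return plane_feat + interp(grid, (x, y, z))
```

In this arrangement the planes can be stored at high resolution to carry fine detail, while the grid stays small and supplies an explicitly three-dimensional signal that a projection onto planes cannot represent on its own.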