Semantic Compression of 3D Objects for Open and Collaborative Virtual Worlds

📅 2025-05-22
🤖 AI Summary
Traditional 3D compression methods suffer severe structural degradation, including texture distortion, mesh collapse, and interstitial gaps, under extreme compression ratios (e.g., 100×). To address this, we propose a semantic compression paradigm that sets aside structural fidelity and instead encodes 3D objects as natural language descriptions, relying on large generative models to reconstruct the object. This work builds a semantics-driven 3D reconstruction pipeline in which natural language serves as the sole compression medium. Our approach integrates CLIP-based feature alignment, text-conditioned diffusion modeling, and geometry-aware prior guidance. Evaluated on Objaverse, it achieves compression ratios as high as 10^5× and outperforms geometric codecs (e.g., Draco) in the quality-preserving region around 100×. Because the compressed representation is plain text, it is human-readable, editable, and collaboration-friendly, which suits efficient 3D content distribution and co-creation in open virtual worlds.

📝 Abstract
Traditional methods for 3D object compression operate only on structural information within the object's vertices, polygons, and textures. These methods are effective at compression rates up to 10x for standard object sizes but quickly deteriorate at higher compression rates with texture artifacts, low polygon counts, and mesh gaps. In contrast, semantic compression ignores structural information and operates directly on the core concepts to push to extreme levels of compression. In addition, it uses natural language as its storage format, which makes it natively human-readable and a natural fit for emerging applications built around large-scale, collaborative projects within augmented and virtual reality. It deprioritizes structural information like location, size, and orientation and predicts the missing information with state-of-the-art deep generative models. In this work, we construct a pipeline for 3D semantic compression from public generative models and explore the quality-compression frontier for 3D object compression. We apply this pipeline to achieve rates as high as 10^5x for 3D objects taken from the Objaverse dataset and show that semantic compression can outperform traditional methods in the important quality-preserving region around 100x compression.
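To make the compression rates in the abstract concrete, here is a minimal sketch of the arithmetic, assuming the rate is simply the original asset size divided by the UTF-8 size of the stored text. The function names, the 5 MB mesh size, and the example caption are illustrative assumptions, not values from the paper.

```python
def compression_ratio(original_bytes: int, description: str) -> float:
    """Original asset size divided by the UTF-8 size of its text description."""
    compressed_bytes = len(description.encode("utf-8"))
    if compressed_bytes == 0:
        raise ValueError("description must not be empty")
    return original_bytes / compressed_bytes


def description_budget(original_bytes: int, target_ratio: float) -> int:
    """Largest description size (in bytes) that still reaches target_ratio."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    return int(original_bytes / target_ratio)


# Hypothetical values chosen for illustration only.
mesh_bytes = 5_000_000  # assumed size of a textured mesh file
caption = "a weathered wooden rocking chair with a red cushion"

ratio = compression_ratio(mesh_bytes, caption)
budget_100x = description_budget(mesh_bytes, 100)

print(f"caption size: {len(caption.encode('utf-8'))} bytes")
print(f"compression ratio: {ratio:,.0f}x")
print(f"text budget at 100x: {budget_100x:,} bytes")
```

Under these assumptions, a one-sentence caption lands several orders of magnitude beyond 100x, while a 100x target leaves room for tens of kilobytes of description. That gap is the quality-compression frontier the abstract refers to: longer descriptions spend bytes to recover detail.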
Problem

Research questions and friction points this paper is trying to address.

Overcoming limitations of traditional 3D compression methods at high rates
Enabling extreme compression by focusing on semantic concepts over structure
Using natural language for human-readable 3D object storage in VR/AR
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic compression ignores structural information
Uses natural language as human-readable storage format
Predicts missing data with deep generative models
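The three points above can be sketched as a small codec interface. This is an illustrative outline, not the authors' pipeline: `SemanticCodec`, `toy_describe`, and `toy_generate` are hypothetical names, and the two stub functions stand in for a captioning model and a text-to-3D generative model.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SemanticCodec:
    """Lossy codec whose stored payload is a plain-text description."""

    describe: Callable[[Any], str]  # hypothetical: 3D object -> description
    generate: Callable[[str], Any]  # hypothetical: description -> 3D object

    def compress(self, obj: Any) -> bytes:
        # Only the description is stored; vertices and textures are dropped.
        return self.describe(obj).encode("utf-8")

    def decompress(self, payload: bytes) -> Any:
        # The generator must predict all structure the text does not carry.
        return self.generate(payload.decode("utf-8"))


def toy_describe(obj: dict) -> str:
    """Stand-in for a captioning model (assumption for illustration)."""
    return f"a {obj['color']} {obj['category']}"


def toy_generate(text: str) -> dict:
    """Stand-in for a text-to-3D model; it produces no real geometry."""
    return {"prompt": text, "mesh": None}


codec = SemanticCodec(describe=toy_describe, generate=toy_generate)
original = {
    "category": "chair",
    "color": "red",
    "vertices": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
}

payload = codec.compress(original)
restored = codec.decompress(payload)

print(payload.decode("utf-8"))
print(restored)
```

Because the payload is ordinary text, a collaborator could edit it by hand before decompression, which is the human-readable property listed above. The original vertices never reach the payload, so any reconstruction is a prediction rather than a recovery.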