🤖 AI Summary
Existing 3D representations, including point clouds, meshes, and 3D Gaussians, struggle to deliver high-fidelity rendering, well-defined surfaces, and efficient compression at the same time. To address this, the authors propose the Textured Surfel Octree (TeSO), a hierarchical representation built from point clouds that approximates geometry with cube-bounded, textured surfels and explicitly retains high-frequency texture detail. TeSO constructs an octree guided by the input point cloud, associates each surfel with a compact texture patch, and introduces a joint geometry-texture entropy coding scheme that leverages the octree structure. Because a single large surfel at a coarse octree level can cover a smooth surface region, this design substantially reduces the number of geometric primitives while improving both rendering fidelity and compression performance. Evaluations on standard benchmarks show that TeSO achieves better PSNR, LPIPS, and subjective visual quality than state-of-the-art point cloud and 3D Gaussian methods at substantially lower bitrates. TeSO is therefore an efficient, general-purpose 3D representation suited to remote 3D rendering and AR/VR streaming.
📝 Abstract
3D visual content streaming is a key technology for emerging 3D telepresence and AR/VR applications. One fundamental element underlying the technology is a versatile 3D representation that can produce high-quality renders and be compressed efficiently at the same time. Existing 3D representations such as point clouds, meshes, and 3D Gaussians each have limitations in terms of rendering quality, surface definition, and compressibility. In this paper, we present the Textured Surfel Octree (TeSO), a novel 3D representation that is built from point clouds but addresses the aforementioned limitations. It represents a 3D scene as cube-bounded surfels organized on an octree, where each surfel is further associated with a texture patch. By approximating a smooth surface with a large surfel at a coarser level of the octree, it reduces the number of primitives required to represent the 3D scene, yet retains the high-frequency texture details through the texture patch attached to each surfel. We further propose a compression scheme that leverages the octree structure to encode the geometry and texture efficiently. Combined with this compression scheme, TeSO achieves higher rendering quality at lower bitrates than multiple point cloud and 3D Gaussian-based baselines.
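The idea of covering a smooth region with one large surfel at a coarse octree level can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the planarity test (RMS distance of the points to a least-squares plane), the threshold `tol`, the depth limit `max_depth`, and the names `Node`, `fit_plane`, `build`, and `count_surfels` are all assumptions made for illustration. Texture patches and compression are left out entirely.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    center: tuple                 # cube center (x, y, z)
    half: float                   # half of the cube edge length
    surfel: tuple = None          # (point on plane, unit normal) for surfel leaves
    children: list = field(default_factory=list)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def fit_plane(points):
    """Least-squares plane fit; returns (centroid, unit normal, RMS distance)."""
    n = len(points)
    c = tuple(sum(p[i] for p in points) / n for i in range(3))
    # 3x3 covariance matrix of the centered points.
    a = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in points) / n
          for j in range(3)] for i in range(3)]
    # Smallest eigenvalue of a symmetric 3x3 matrix (trigonometric closed form).
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][j] - (q if i == j else 0.0)) ** 2
             for i in range(3) for j in range(3))
    p = math.sqrt(p2 / 6.0)
    if p < 1e-15:
        lam = q                   # all three eigenvalues coincide
    else:
        b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
             for i in range(3)]
        det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
               - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
               + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
        phi = math.acos(max(-1.0, min(1.0, det / 2.0))) / 3.0
        lam = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam = max(lam, 0.0)           # guard against tiny negative round-off
    # Eigenvector for lam: largest cross product of two rows of (A - lam * I).
    m = [[a[i][j] - (lam if i == j else 0.0) for j in range(3)]
         for i in range(3)]
    best = max((cross(m[0], m[1]), cross(m[0], m[2]), cross(m[1], m[2])),
               key=lambda v: v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    norm = math.sqrt(sum(x * x for x in best))
    # Degenerate input (one point, collinear points): fall back to an arbitrary normal.
    normal = tuple(x / norm for x in best) if norm > 0.0 else (0.0, 0.0, 1.0)
    return c, normal, math.sqrt(lam)


def build(points, center, half, tol, max_depth, depth=0):
    """Return an octree node for the points inside the cube, or None if empty."""
    if not points:
        return None
    node = Node(center, half)
    centroid, normal, rms = fit_plane(points)
    if rms <= tol or depth == max_depth:
        node.surfel = (centroid, normal)   # flat enough: one surfel covers the cube
        return node
    buckets = {}                           # octant key -> points in that octant
    for pt in points:
        key = tuple(pt[i] >= center[i] for i in range(3))
        buckets.setdefault(key, []).append(pt)
    for key, pts in buckets.items():
        child_center = tuple(center[i] + (half / 2 if key[i] else -half / 2)
                             for i in range(3))
        node.children.append(
            build(pts, child_center, half / 2, tol, max_depth, depth + 1))
    return node


def count_surfels(node):
    if node.surfel is not None:
        return 1
    return sum(count_surfels(child) for child in node.children)


# A flat patch needs a single surfel; a floor meeting a wall needs more.
flat = [(x / 8, y / 8, 0.5) for x in range(9) for y in range(9)]
floor = [(x / 8, y / 8, 0.0) for x in range(9) for y in range(9)]
wall = [(0.0, y / 8, z / 8) for y in range(9) for z in range(1, 9)]
for name, pts in (("flat", flat), ("corner", floor + wall)):
    root = build(pts, (0.5, 0.5, 0.5), 0.5, tol=1e-3, max_depth=4)
    print(name, count_surfels(root))
```

In this sketch, subdivision continues only where the points inside a cube deviate from a plane, so flat regions end as large leaves near the root while edges and curved regions are refined down to `max_depth`. A fuller version would also attach a texture patch to each surfel leaf, which is what lets the representation keep fine appearance detail despite the coarse geometry.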