Region-Adaptive Learned Hierarchical Encoding for 3D Gaussian Splatting Data

📅 2025-10-26

📈 Citations: 0

✨ Influential: 0

career value

239K/year

🤖 AI Summary

To address the large model size and deployment challenges of 3D Gaussian Splatting (3DGS) in bandwidth-constrained scenarios—such as volumetric media streaming—this paper proposes a region-adaptive hierarchical learning-based coding framework. It introduces, for the first time, an overfitting-aware learned compression paradigm for 3DGS data, constructing an octree-driven, geometry-aware multi-resolution latent space. A spatial-context-aware autoregressive entropy model is further designed to jointly optimize distortion and rate for irregular Gaussian attributes. The framework enables end-to-end joint optimization of latent representations, a lightweight decoding network, and global bit-rate constraints. Under a strict budget of <1 MB, our method achieves up to a 2 dB PSNR gain over baseline methods, significantly improving both compression efficiency and reconstruction fidelity.

Technology Category

Application Category

📝 Abstract

We introduce Region-Adaptive Learned Hierarchical Encoding (RALHE) for 3D Gaussian Splatting (3DGS) data. While 3DGS has recently become popular for novel view synthesis, the size of trained models limits its deployment in bandwidth-constrained applications such as volumetric media streaming. To address this, we propose a learned hierarchical latent representation that builds upon the principles of "overfitted" learned image compression (e.g., Cool-Chic and C3) to efficiently encode 3DGS attributes. Unlike images, 3DGS data have irregular spatial distributions of Gaussians (geometry) and consist of multiple attributes (signals) defined on the irregular geometry. Our codec is designed to account for these differences between images and 3DGS. Specifically, we leverage the octree structure of the voxelized 3DGS geometry to obtain a hierarchical multi-resolution representation. Our approach overfits latents to each Gaussian attribute under a global rate constraint. These latents are decoded independently through a lightweight decoder network. To estimate the bitrate during training, we employ an autoregressive probability model that leverages octree-derived contexts from the 3D point structure. The multi-resolution latents, decoder, and autoregressive entropy coding networks are jointly optimized for each Gaussian attribute. Experiments demonstrate that the proposed RALHE compression framework achieves a rendering PSNR gain of up to 2dB at low bitrates (less than 1 MB) compared to the baseline 3DGS compression methods.

Problem

Research questions and friction points this paper is trying to address.

Compresses 3D Gaussian Splatting models to reduce bandwidth usage

Encodes irregular 3DGS geometry and attributes using hierarchical representation

Improves rendering quality at low bitrates compared to existing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Learned hierarchical encoding for 3D Gaussian attributes

Octree-based multi-resolution latent representation

Autoregressive entropy coding with joint optimization

🔎 Similar Papers

No similar papers found.