🤖 AI Summary
To address the high memory footprint and low compression efficiency of 3D Gaussian Splatting (3DGS), this paper proposes the first end-to-end optimized hierarchical transform coding framework for 3DGS compression. Unlike existing approaches such as Scaffold-GS, which rely on complex context modeling for entropy coding, this method introduces sparsity-guided hierarchical transforms: it jointly optimizes the geometric representation, analysis-synthesis transforms built on interpretable KLT bases, and a lightweight signal-prior-driven context model, combined with ISTA-unrolled residual reconstruction. The resulting framework achieves superior rate-distortion performance while keeping parameter count and computational overhead low. Experiments show over 30% bitrate reduction at equivalent rendering quality compared to state-of-the-art methods including Scaffold-GS, validating both efficacy and efficiency.
📝 Abstract
3D Gaussian Splatting (3DGS) has gained popularity for its fast, high-quality rendering, but its very large memory footprint incurs high transmission and storage overhead. Recently, neural compression methods such as Scaffold-GS were proposed for 3DGS, but they do not adopt end-to-end optimized analysis-synthesis transforms, an approach proven highly effective in neural signal compression. Without an appropriate analysis transform, signal correlations cannot be removed by sparse representation; the only remaining way to remove signal redundancies is entropy coding driven by complex and expensive context modeling, which results in slower speed and suboptimal rate-distortion (R-D) performance. To overcome this weakness, we propose Sparsity-guided Hierarchical Transform Coding (SHTC), the first end-to-end optimized transform coding framework for 3DGS compression. SHTC jointly optimizes the 3DGS representation, the transforms, and a lightweight context model. This joint optimization enables the transforms to produce representations that approach the best achievable R-D performance. The SHTC framework consists of a base layer that uses the Karhunen-Loève Transform (KLT) for data decorrelation, and a sparsity-coded enhancement layer that compresses the KLT residuals to refine the representation. The enhancement encoder learns a linear transform that projects high-dimensional inputs into a low-dimensional space, while the decoder unfolds the Iterative Shrinkage-Thresholding Algorithm (ISTA) to reconstruct the residuals. All components are designed to be interpretable, allowing the incorporation of signal priors and requiring fewer parameters than black-box transforms. This design significantly improves R-D performance with minimal additional parameters and computational overhead.
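The two building blocks named in the abstract, KLT-based decorrelation and an ISTA-unrolled decoder, can be sketched in a few lines of NumPy. This is a minimal illustration under assumed shapes and hypothetical names (`klt_basis`, `ista_decode` are not the paper's API), and it uses a random measurement matrix in place of the learned linear encoder:

```python
import numpy as np

def klt_basis(X):
    """KLT basis = eigenvectors of the sample covariance of X (rows = samples).
    Projecting centered data onto this basis decorrelates the coefficients."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)
    _, vecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return vecs[:, ::-1]            # largest-variance components first

def soft_threshold(x, t):
    """Proximal operator of the L1 norm (the shrinkage step in ISTA)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista_decode(y, A, n_steps=10, lam=0.1):
    """Unrolled ISTA: n_steps fixed iterations recovering a sparse code z
    with y ≈ A z. In the paper these operators are learned end to end;
    here A is a fixed stand-in and the step size is set from ||A||_2."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = Lipschitz const of A^T A
    z = np.zeros(A.shape[1])
    for _ in range(n_steps):
        z = soft_threshold(z + step * (A.T @ (y - A @ z)), step * lam)
    return z

# Toy usage: decorrelate synthetic 16-dim attribute vectors in the base layer.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
B = klt_basis(X)
coeffs = (X - X.mean(axis=0)) @ B   # decorrelated base-layer coefficients
```

In an unrolled (learned) version, the step size, threshold, and matrices of each iteration would become trainable parameters, which is what lets a small fixed number of iterations act as an interpretable decoder.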