🤖 AI Summary
To address the excessive storage overhead and deployment challenges of 3D Gaussian Splatting (3DGS) in man-made scenes, this paper proposes a “sketch-color block” dual-stream Gaussian representation. It encodes boundary-sensitive Sketch Gaussians via parametric geometric models for structural awareness, while compressing smooth-region Patch Gaussians through adaptive pruning, incremental retraining, and vector quantization (VQ). This work is the first to introduce the artistic “sketch first, then color” paradigm into 3D Gaussian modeling, establishing a semantics-driven, differentiated compression framework. Experiments demonstrate that, at equal model capacity, the method achieves up to 32.62% higher PSNR, 19.12% higher SSIM, and 45.41% lower LPIPS compared to baseline 3DGS. Moreover, for indoor scenes, it retains rendering fidelity using only 2.3% of the original model’s storage.
📝 Abstract
3D Gaussian Splatting (3DGS) has emerged as a promising representation for photorealistic rendering of 3D scenes. However, its high storage requirements pose significant challenges for practical applications. We observe that Gaussians exhibit distinct roles and characteristics analogous to traditional artistic techniques: just as artists first sketch outlines before filling in broader areas with color, some Gaussians capture high-frequency features such as edges and contours, while others represent smoother regions, analogous to the broad brush strokes that add volume and depth to a painting. Based on this observation, we propose a novel hybrid representation that categorizes Gaussians into (i) Sketch Gaussians, which define scene boundaries, and (ii) Patch Gaussians, which cover smooth regions. Sketch Gaussians are efficiently encoded using parametric models, leveraging their geometric coherence, while Patch Gaussians undergo optimized pruning, retraining, and vector quantization to maintain volumetric consistency and storage efficiency. Our comprehensive evaluation across diverse indoor and outdoor scenes demonstrates that this structure-aware approach achieves up to 32.62% improvement in PSNR, 19.12% in SSIM, and 45.41% in LPIPS at equivalent model sizes; moreover, for an indoor scene, our model maintains visual quality at only 2.3% of the original model size.
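To make the Patch-Gaussian compression step concrete, here is a minimal, hedged sketch of vector quantization over per-Gaussian attribute vectors (e.g., color or SH coefficients). This is a toy k-means codebook in NumPy, not the paper's implementation; the function name, parameters, and attribute dimensionality are illustrative assumptions.

```python
import numpy as np

def vector_quantize(params, k=16, iters=20, seed=0):
    """Toy k-means vector quantization (NOT the paper's exact method).

    params: (N, D) array of per-Gaussian attributes (e.g., colors).
    Returns (codebook, indices): each Gaussian then stores only a small
    codebook index instead of its full D-dimensional attribute vector,
    reducing storage from N*D floats to k*D floats + N indices.
    """
    rng = np.random.default_rng(seed)
    # Initialize the codebook with k randomly chosen attribute vectors.
    codebook = params[rng.choice(len(params), size=k, replace=False)]
    for _ in range(iters):
        # Assign each Gaussian to its nearest codebook entry.
        dists = np.linalg.norm(params[:, None, :] - codebook[None, :, :], axis=-1)
        indices = dists.argmin(axis=1)
        # Update each codebook entry to the mean of its assigned vectors.
        for c in range(k):
            members = params[indices == c]
            if len(members) > 0:
                codebook[c] = members.mean(axis=0)
    return codebook, indices
```

In a full pipeline along the lines the abstract describes, pruning would first discard low-contribution Patch Gaussians, a brief retraining pass would recover quality, and a codebook like the one above would then replace per-Gaussian attributes with compact indices.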