🤖 AI Summary
This work addresses the redundancy and poor compressibility of 3D Gaussian Splatting (3DGS) representations. To this end, we propose a compact encoding framework that leverages both spatial structure and semantic features. Methodologically, we design a recursive voxel hierarchy that explicitly models the spatial distribution of Gaussians, together with a lightweight feature abstraction network that jointly encodes multiple semantic attributes (color, opacity, covariance, and material) into a compact, discrete, learnable representation. The framework supports end-to-end training and preserves rendering fidelity and view consistency while achieving substantial compression gains. Experiments demonstrate state-of-the-art compression across multiple standard benchmarks, reducing storage overhead by 62% on average. Moreover, the learned representation naturally enables downstream tasks such as semantic editing and material transfer, offering both a high compression ratio and strong generalization.
📝 Abstract
We present Smol-GS, a novel method for learning compact representations for 3D Gaussian Splatting (3DGS). Our approach learns highly efficient encodings in 3D space that integrate both spatial and semantic information. The model captures splat coordinates through a recursive voxel hierarchy, while per-splat features store abstracted cues, including color, opacity, transformation, and material properties. This design compresses 3D scenes by orders of magnitude without sacrificing flexibility. Smol-GS achieves state-of-the-art compression on standard benchmarks while maintaining high rendering quality. Beyond visual fidelity, the discrete representations could serve as a foundation for downstream tasks such as navigation, planning, and broader 3D scene understanding.
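The abstract does not specify how the recursive voxel hierarchy is realized, but one common way to model the spatial distribution of splat centers compactly is an octree-style subdivision whose structure is serialized as a stream of 8-bit occupancy masks. The sketch below illustrates that idea; all names (`VoxelNode`, `insert`, `occupancy_stream`) and the fixed-depth design are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class VoxelNode:
    """One cell in a recursive (octree-style) voxel hierarchy."""
    depth: int
    children: dict = field(default_factory=dict)   # octant index (0-7) -> VoxelNode
    points: list = field(default_factory=list)     # splat coordinates stored at leaves

def octant_index(p, c):
    """3-bit octant index of point p relative to cell center c (one bit per axis)."""
    return (p[0] >= c[0]) + ((p[1] >= c[1]) << 1) + ((p[2] >= c[2]) << 2)

def insert(node, p, center, half, max_depth):
    """Recursively route point p into the hierarchy, subdividing until max_depth."""
    if node.depth == max_depth:
        node.points.append(p)
        return
    idx = octant_index(p, center)
    child = node.children.setdefault(idx, VoxelNode(node.depth + 1))
    # Shift the center toward the chosen octant and halve the cell extent.
    child_center = tuple(
        c + (half / 2 if (idx >> axis) & 1 else -half / 2)
        for axis, c in enumerate(center)
    )
    insert(child, p, child_center, half / 2, max_depth)

def occupancy_stream(node):
    """Serialize the tree breadth-first: one occupancy byte per internal node.

    A decoder that knows max_depth can rebuild the exact voxel structure
    (and hence quantized splat positions) from this byte stream alone.
    """
    out, queue = [], [node]
    while queue:
        n = queue.pop(0)
        if n.points:                     # leaf cell: no occupancy byte needed
            continue
        byte = 0
        for i in range(8):
            if i in n.children:
                byte |= 1 << i
                queue.append(n.children[i])
        out.append(byte)
    return bytes(out)

# Usage: two points in the unit cube, hierarchy of depth 2.
root = VoxelNode(0)
insert(root, (0.1, 0.1, 0.1), (0.5, 0.5, 0.5), 0.5, 2)
insert(root, (0.9, 0.9, 0.9), (0.5, 0.5, 0.5), 0.5, 2)
stream = occupancy_stream(root)  # three occupied internal cells -> 3 bytes
```

Because only occupied cells are descended into, sparse scenes cost far fewer bits than a dense grid; per-splat semantic features would then be attached to the leaf cells.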