🤖 AI Summary
3D Gaussian Splatting (3DGS) is hard to deploy on edge devices because it requires a prohibitively large number of Gaussian primitives. To address this, the paper proposes a prototype-based lightweight representation that replaces the massive set of original Gaussians with a small, learnable set of 3D Gaussian prototypes. Prototypes are constructed by grouping Gaussians around SfM anchor points and running K-means clustering within each group, and they are optimized jointly with the anchors to preserve both geometric consistency and rendering fidelity. This work introduces the first Gaussian prototype learning paradigm. Evaluated on both real-world and synthetic datasets, it achieves over 90% reduction in Gaussian count and a 2.1× speedup in rendering, while surpassing existing compression methods in PSNR and SSIM. Crucially, it enables real-time novel-view synthesis on resource-constrained edge devices for the first time.
📝 Abstract
3D Gaussian Splatting (3DGS) has made significant strides in novel view synthesis but is limited by the substantial number of Gaussian primitives it requires, which poses challenges for deployment on lightweight devices. Recent methods address this issue by compressing the storage size of densified Gaussians, yet fail to preserve rendering quality and efficiency. To overcome these limitations, we propose ProtoGS, which learns Gaussian prototypes to represent Gaussian primitives, significantly reducing the total number of Gaussians without sacrificing visual quality. Our method renders directly from the Gaussian prototypes for efficiency and leverages the resulting reconstruction loss to guide prototype learning. To further improve memory efficiency during training, we use structure-from-motion (SfM) points as anchor points to group Gaussian primitives. Gaussian prototypes are derived within each group by K-means clustering, and both the anchor points and the prototypes are optimized jointly. Our experiments on real-world and synthetic datasets show that our method outperforms existing methods, achieving a substantial reduction in the number of Gaussians and enabling high rendering speed while maintaining or even enhancing rendering fidelity.
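The grouping and clustering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`assign_to_anchors`, `kmeans`, `build_prototypes`), the flat per-Gaussian feature vector whose first three columns are the 3D center, and the parameter `k_per_group` are all assumptions made for illustration. The sketch also averages raw parameters within a cluster, which is a simplification (rotations, for example, would need more care), and it omits the joint optimization of anchors and prototypes through the rendering loss.

```python
import numpy as np


def assign_to_anchors(centers, anchors):
    """Return, for each Gaussian center, the index of its nearest anchor."""
    # Pairwise squared distances, shape (num_gaussians, num_anchors).
    d2 = ((centers[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def kmeans(features, k, iters=10, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    k = min(k, len(features))  # cannot have more clusters than points
    init = rng.choice(len(features), size=k, replace=False)
    centroids = features[init].copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def build_prototypes(gaussians, anchors, k_per_group=4):
    """Group Gaussians by nearest anchor, then cluster each group.

    gaussians: (N, D) array; columns 0..2 are assumed to be the 3D center.
    anchors:   (A, 3) array of anchor positions (e.g. SfM points).
    Returns an (M, D) array of prototypes with M <= A * k_per_group.
    """
    groups = assign_to_anchors(gaussians[:, :3], anchors)
    prototypes = []
    for a in range(len(anchors)):
        members = gaussians[groups == a]
        if len(members) == 0:
            continue  # anchor attracted no Gaussians
        centroids, _ = kmeans(members, k_per_group)
        prototypes.append(centroids)
    return np.concatenate(prototypes, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gaussians = rng.normal(size=(1000, 8))  # synthetic stand-in data
    anchors = rng.normal(size=(10, 3))
    protos = build_prototypes(gaussians, anchors)
    print(gaussians.shape[0], "Gaussians ->", protos.shape[0], "prototypes")
```

In this sketch the prototype count is bounded by the number of anchors times `k_per_group`, which is where the reduction in Gaussian count comes from. In the method as described, these clustered prototypes would then be refined by rendering with them and backpropagating the reconstruction loss.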