🤖 AI Summary
This work addresses the lack of systematic quality assessment for Gaussian Splatting (GS)-based novel view synthesis. We present the first comprehensive subjective and objective quality benchmark designed specifically for videos of static scenes rendered with GS. Methodologically, we construct an open-source benchmark covering both 360° and forward-facing scenes, comprising video sequences generated by several state-of-the-art GS methods (e.g., 3DGS, GaussianGroup) alongside the corresponding human perceptual scores (Mean Opinion Scores, MOS). We systematically evaluate the correlation between MOS and 18 objective metrics, including PSNR, LPIPS, DISTS, and CLIP-IQA. Key contributions are: (1) the first publicly available, GS-specific subjective–objective quality benchmark; (2) empirical evidence that learning-based metrics (e.g., CLIP-IQA) exhibit superior consistency with human perception compared to conventional image-fidelity metrics; and (3) a reproducible, standardized foundation for guiding GS algorithm development and objective metric design.
📝 Abstract
Gaussian Splatting (GS) offers a promising alternative to Neural Radiance Fields (NeRF) for real-time 3D scene rendering. By representing complex geometry and appearance with a set of 3D Gaussians, GS achieves faster rendering and lower memory consumption than the neural-network approach used in NeRF. However, quality assessment of GS-generated static content has not yet been explored in depth. This paper describes a subjective quality assessment study that evaluates synthesized videos obtained with several state-of-the-art static GS methods. The methods were applied to diverse visual scenes, covering both 360-degree and forward-facing (FF) camera trajectories. Moreover, the performance of 18 objective quality metrics was analyzed using the scores from the subjective study, providing insights into their strengths, limitations, and alignment with human perception. All videos and scores are made available, providing a comprehensive database that can be used as a benchmark for GS view synthesis and objective quality metrics.
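To make the metric analysis concrete, the following is a minimal sketch of how an objective metric is commonly compared against subjective scores, using the Pearson linear correlation coefficient (PLCC) and the Spearman rank-order correlation coefficient (SROCC). The abstract does not state which correlation measures the study uses, so this choice is an assumption. The MOS and PSNR values are made up for illustration and are not taken from the database, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative, made-up values: one MOS and one metric score per video.
mos = [4.5, 3.8, 2.1, 3.0, 1.5]
psnr = [33.0, 30.0, 24.0, 29.0, 25.0]

print(f"PLCC  = {pearson(mos, psnr):.3f}")
print(f"SROCC = {spearman(mos, psnr):.3f}")
```

SROCC depends only on how the metric orders the videos, so it is unaffected by any monotonic rescaling of the metric, whereas PLCC measures linear agreement. Quality assessment studies often fit a nonlinear mapping from metric scores to MOS before computing PLCC; this sketch omits that step.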