🤖 AI Summary
Existing no-reference video quality assessment (NR-VQA) methods are designed for camera-captured videos and suffer significant performance degradation on rendered videos (e.g., gaming, VR), primarily due to their neglect of temporal artifacts. This work presents the first systematic study of NR-VQA for rendered content. We introduce RenderVQA—the first large-scale, multi-scenario, multi-rendering-setup dataset featuring subjective quality scores across diverse display types. To address the unique distortions introduced by temporal super-resolution and frame generation, we propose RQNet, a deep learning–based metric that jointly models spatial fidelity and temporal stability, explicitly capturing time-domain degradations. Experiments demonstrate that RQNet substantially outperforms state-of-the-art NR-VQA methods on RenderVQA (average PLCC improvement of 0.21). Moreover, it enables reliable benchmarking of super-resolution techniques and quantitative evaluation of frame-generation strategies, establishing a robust, deployable tool for real-time rendering quality analysis.
📝 Abstract
Quality assessment of videos is crucial for many computer graphics applications, including video games, virtual reality, and augmented reality, where visual performance strongly affects user experience. When test videos cannot be perfectly aligned with references, or when references are unavailable altogether, no-reference video quality assessment (NR-VQA) methods become essential. However, existing NR-VQA datasets and metrics focus primarily on camera-captured videos; applying them directly to rendered videos yields biased predictions, as rendered videos are more prone to temporal artifacts. To address this, we present a large rendering-oriented video dataset with subjective quality annotations, together with an NR-VQA metric designed specifically for rendered videos. The proposed dataset covers a wide range of 3D scenes and rendering settings, with quality scores annotated across multiple display types to better reflect real-world application scenarios. Building on this dataset, we calibrate our NR-VQA metric to assess rendered video quality by jointly considering image quality and temporal stability. We compare our metric against existing NR-VQA metrics, demonstrating its superior performance on rendered videos. Finally, we show that our metric can benchmark supersampling methods and assess frame generation strategies in real-time rendering.
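For context, the headline gain reported above (average PLCC improvement of 0.21) uses the Pearson linear correlation coefficient, the standard figure of merit in VQA: it measures linear agreement between a metric's predicted scores and subjective mean opinion scores (MOS). A minimal sketch, with made-up scores for illustration only (not values from the paper):

```python
import numpy as np

def plcc(predicted, subjective):
    """Pearson linear correlation coefficient between predicted
    quality scores and subjective MOS values."""
    p = np.asarray(predicted, dtype=float)
    s = np.asarray(subjective, dtype=float)
    p -= p.mean()
    s -= s.mean()
    return float((p * s).sum() / np.sqrt((p ** 2).sum() * (s ** 2).sum()))

# Hypothetical predicted scores and MOS for five clips (illustrative only).
pred = [3.1, 4.0, 2.5, 4.8, 3.6]
mos = [3.0, 4.2, 2.2, 4.9, 3.5]
print(round(plcc(pred, mos), 3))
```

A PLCC of 1.0 indicates perfect linear agreement with human judgments, so an average improvement of 0.21 on this bounded scale is a substantial margin over prior NR-VQA metrics.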