🤖 AI Summary
Existing no-reference image quality assessment (IQA) methods and static datasets fail to model the dynamic temporal banding artifacts prevalent in highly compressed videos, especially in smooth regions. Method: We introduce LIVE-YT-Banding, the first open-source benchmark dataset designed specifically for video banding artifacts, comprising multi-resolution, multi-bitrate AV1-compressed videos with corresponding human subjective scores. Building on it, we propose CBAND, a no-reference video quality assessment model that jointly leverages natural-image statistics priors and deep feature embeddings, supports end-to-end training, and can serve as a differentiable loss function for debanding optimization. Contribution/Results: CBAND predicts perceived banding significantly more accurately than state-of-the-art methods while running two to three orders of magnitude faster at inference. The complete dataset, source code, and pre-trained models are publicly released.
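The summary reports prediction accuracy against human subjective scores. As a hedged illustration of how such agreement is conventionally quantified in IQA/VQA work (the paper's exact evaluation protocol is not stated here), the sketch below computes SROCC and PLCC between hypothetical model predictions and mean opinion scores; all data in it are placeholders.

```python
# Illustrative only: standard rank (SROCC) and linear (PLCC) correlation
# between a quality model's predictions and human mean opinion scores (MOS).
# Note: many studies fit a logistic mapping before computing PLCC; that step
# is omitted here for brevity.
import numpy as np
from scipy import stats

def agreement(pred: np.ndarray, mos: np.ndarray) -> dict:
    srocc = stats.spearmanr(pred, mos)[0]  # monotonic (rank) agreement
    plcc = stats.pearsonr(pred, mos)[0]    # linear agreement
    return {"SROCC": srocc, "PLCC": plcc}

# Toy example: random placeholder scores for 160 videos (the dataset size).
rng = np.random.default_rng(0)
mos = rng.uniform(1, 5, size=160)                 # fake subjective scores
pred = mos + rng.normal(scale=0.3, size=160)      # fake model predictions
print(agreement(pred, mos))
```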
📝 Abstract
Although there have been notable advancements in video compression technologies in recent years, banding artifacts remain a serious issue affecting the quality of compressed videos, particularly in smooth regions of high-definition videos. Noticeable banding artifacts can severely impact the perceptual quality of videos viewed on high-end HDTVs or high-resolution screens. Hence, there is a pressing need for a systematic investigation of the banding video quality assessment problem for advanced video codecs. Given that the existing publicly available datasets for studying banding artifacts are limited to still pictures, which cannot account for temporal banding dynamics, we have created a first-of-its-kind open video dataset, dubbed LIVE-YT-Banding, which consists of 160 videos generated with four different compression parameters using the AV1 video codec. A total of 7,200 subjective opinions were collected from a cohort of 45 human subjects. To demonstrate the value of this new resource, we tested and compared a variety of models that detect banding occurrences and measure their impact on perceived quality. Among these, we introduce an effective and efficient new no-reference (NR) video quality evaluator, which we call CBAND. CBAND leverages the learned statistics of natural images expressed in the embeddings of deep neural networks. Our experimental results show that the perceptual banding prediction performance of CBAND significantly exceeds that of previous state-of-the-art models, while also being orders of magnitude faster. Moreover, CBAND can be employed as a differentiable loss function to optimize video debanding models. The LIVE-YT-Banding database, code, and pre-trained model are all publicly available at https://github.com/uniqzheng/CBAND.
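The abstract notes that CBAND can serve as a differentiable loss for optimizing debanding models. The following is a minimal sketch of that pattern, assuming a frozen, pretrained banding scorer whose output grows with banding severity; the classes `ToyBandingPredictor` and `DebandingNet` are illustrative placeholders, not the repository's actual API.

```python
# Sketch: using a frozen perceptual banding scorer as a training penalty
# for a debanding network. Placeholder modules stand in for CBAND and for
# a real restoration model; shapes and weights are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBandingPredictor(nn.Module):
    """Stand-in for a pretrained CBAND-style scorer (higher = more banding)."""
    def __init__(self):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
        )
        for p in self.parameters():
            p.requires_grad_(False)  # frozen: only used as a loss signal

    def forward(self, x):
        return self.score(x)  # (B, 1) banding severity per frame

class DebandingNet(nn.Module):
    """Toy residual restoration model (banded frames -> restored frames)."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

predictor = ToyBandingPredictor().eval()
deband = DebandingNet()
optimizer = torch.optim.Adam(deband.parameters(), lr=1e-4)

loader = [torch.rand(2, 3, 64, 64) for _ in range(3)]  # fake banded frames
for frames in loader:
    restored = deband(frames)
    fidelity = F.l1_loss(restored, frames)          # stay close to the input
    banding = predictor(restored).mean()            # grads flow through the
    loss = fidelity + 0.1 * banding                 # frozen scorer to deband
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key point is that the scorer's parameters are frozen but gradients still propagate through it to the restored frames, so minimizing its output steers the debanding network toward perceptually band-free output; the 0.1 weighting between fidelity and banding terms is an arbitrary choice for this sketch.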