Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos

📅 2026-03-01
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing video quality assessment methods are primarily designed for standard dynamic range (SDR) content and struggle to model distortions unique to high dynamic range (HDR) user-generated content (UGC), such as near-black crushing, highlight clipping, color banding, and exposure flickering. To address this gap, this work presents Beyond8Bits, the first large-scale subjective quality dataset for HDR-UGC, and introduces HDR-Q, the first multimodal large language model tailored for HDR-UGC quality assessment. HDR-Q integrates an HDR-perception-aware visual encoder with a reinforcement-learning-based HDR-Aware Policy Optimization (HAPO) framework, enhanced by contrastive KL regularization, Gaussian-weighted regression rewards, and crowd-rating modeling to accurately capture HDR-specific distortions. Extensive evaluations on Beyond8Bits and public HDR-VQA benchmarks demonstrate that HDR-Q significantly outperforms existing approaches, achieving state-of-the-art performance.
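The summary does not spell out the contrastive KL term, but its stated goal — pushing the model's token distribution under HDR inputs away from what it would produce for the SDR version of the same video — can be sketched with plain discrete distributions. The helper names and toy logits below are illustrative assumptions, not the paper's implementation:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl(p, q):
    """KL(p || q) between two discrete token distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token logits for the same prompt paired with HDR vs. SDR
# frames (real values would come from the MLLM's two forward passes).
logits_hdr = [2.0, 0.5, -1.0]
logits_sdr = [1.0, 1.0, 0.0]

# A contrastive objective rewards divergence from the SDR-conditioned
# distribution, so the generated tokens rely on HDR-specific evidence.
contrastive_term = kl(softmax(logits_hdr), softmax(logits_sdr))
```

The term is zero when HDR and SDR inputs produce identical predictions, so maximizing it (or penalizing its negation) only moves tokens that actually depend on dynamic-range cues.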

๐Ÿ“ Abstract
High Dynamic Range (HDR) user-generated content (UGC) videos are rapidly proliferating across social platforms, yet most perceptual video quality assessment (VQA) systems remain tailored to Standard Dynamic Range (SDR). HDR's higher bit depth, wider color gamut, and elevated luminance range expose distortions such as near-black crushing, highlight clipping, banding, and exposure flicker that amplify UGC artifacts and challenge SDR models. To catalyze progress, we curate Beyond8Bits, a large-scale subjective dataset of 44K videos from 6.5K sources with over 1.5M crowd ratings, spanning diverse scenes, capture conditions, and compression settings. We further introduce HDR-Q, the first Multimodal Large Language Model (MLLM) for HDR-UGC VQA. We propose (i) a novel HDR-aware vision encoder that produces HDR-sensitive embeddings, and (ii) HDR-Aware Policy Optimization (HAPO), an RL fine-tuning framework that anchors reasoning to HDR cues. HAPO augments GRPO with an HDR-SDR contrastive KL that encourages token reliance on HDR inputs and a Gaussian-weighted regression reward for fine-grained MOS calibration. Across Beyond8Bits and public HDR-VQA benchmarks, HDR-Q delivers state-of-the-art performance.
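The abstract does not give the Gaussian-weighted regression reward in closed form, but a reward of that description — smoothly decaying with the gap between the predicted and ground-truth mean opinion score (MOS) — might look like the following sketch. The function name and the `sigma` bandwidth are assumptions for illustration:

```python
import math

def gaussian_regression_reward(predicted_mos: float,
                               target_mos: float,
                               sigma: float = 0.5) -> float:
    """Reward that decays smoothly with distance from the target MOS.

    A Gaussian kernel gives a reward near 1 for close predictions and
    tapers off for larger errors, providing a fine-grained calibration
    signal instead of a hard correct/incorrect one.
    """
    return math.exp(-((predicted_mos - target_mos) ** 2) / (2 * sigma ** 2))

# An exact prediction earns the maximum reward of 1.0;
# reward shrinks continuously as the prediction drifts away.
exact = gaussian_regression_reward(3.2, 3.2)
close = gaussian_regression_reward(2.8, 3.0)
far = gaussian_regression_reward(2.0, 3.0)
```

Compared with a binary accuracy reward, this shape keeps the policy gradient informative even when predictions are already near the target, which is what fine-grained MOS calibration requires.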
Problem

Research questions and friction points this paper is trying to address.

HDR
UGC
video quality assessment
perceptual distortion
dynamic range
Innovation

Methods, ideas, or system contributions that make the work stand out.

HDR-Q
HDR-aware vision encoder
HDR-Aware Policy Optimization
Beyond8Bits
Multimodal Large Language Model