Video Quality Assessment for Online Processing: From Spatial to Temporal Sampling

📅 2024-12-01
🏛️ IEEE Transactions on Circuits and Systems for Video Technology (Print)
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses online video quality assessment (VQA), probing the fundamental limits of spatiotemporal redundancy compression to reach a favorable trade-off between efficiency and accuracy. The authors propose a joint spatiotemporal sampling strategy that reduces spatial resolution and frame rate to ≤10% of the original while incurring <8% performance degradation. The method comprises a lightweight spatial feature extractor, an efficient temporal fusion module, and a global quality regression network, forming a low-latency, real-time-capable VQA architecture. According to the authors, this is the first study to systematically characterize the performance tolerance thresholds for spatiotemporal compression in VQA. Extensive experiments on six mainstream public benchmarks demonstrate strong generalization and robustness, establishing a practical, high-accuracy, low-overhead solution for real-time VQA in edge-computing and streaming scenarios.
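The joint sampling idea above can be sketched in a few lines: subsample frames in time, then downsample each kept frame in space, so that only a small fraction of the original spatiotemporal volume reaches the VQA model. This is a minimal illustration, not the paper's implementation; the values `spatial_scale=0.25` and `temporal_stride=8` are assumed for the example (together they keep well under 10% of the voxels), and nearest-neighbour striding stands in for whatever resampling the authors actually use.

```python
import numpy as np

def spatiotemporal_sample(video, spatial_scale=0.25, temporal_stride=8):
    """Jointly sample a video of shape (T, H, W, C) in time and space.

    Illustrative defaults: keep every 8th frame, then keep every 4th
    pixel in each dimension (scale 0.25), so the sampled clip retains
    (1/8) * (1/4)^2 ~ 0.8% of the original spatiotemporal voxels.
    """
    frames = video[::temporal_stride]            # temporal subsampling
    step = max(1, round(1 / spatial_scale))      # spatial stride from the scale
    frames = frames[:, ::step, ::step, :]        # nearest-neighbour spatial downsampling
    return frames

# Usage: a 32-frame, 64x64 RGB clip shrinks to a 4-frame, 16x16 clip.
video = np.zeros((32, 64, 64, 3), dtype=np.float32)
clip = spatiotemporal_sample(video)
```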

📝 Abstract
With the rapid development of multimedia processing and deep learning technologies, especially in the field of video understanding, video quality assessment (VQA) has achieved significant progress. Although researchers have moved from designing efficient video quality mapping models to various other research directions, the effectiveness-efficiency trade-offs of spatio-temporal modeling in VQA models remain insufficiently explored. Given that videos contain highly redundant information, this paper investigates this problem from the perspective of joint spatial and temporal sampling, seeking to answer how little information we need to keep when feeding videos into VQA models while sacrificing only an acceptable amount of performance. To this end, we drastically sample the video's information along both the spatial and temporal dimensions, and the heavily squeezed video is then fed into a stable VQA model. Comprehensive experiments on joint spatial and temporal sampling are conducted on six public video quality databases, and the results show that the VQA model retains acceptable performance even when most of the video information is discarded. Furthermore, with the proposed joint spatial and temporal sampling strategy, we make an initial attempt to design an online VQA model, instantiated with a spatial feature extractor, a temporal feature fusion module, and a global quality regression module that are each kept as simple as possible. Through quantitative and qualitative experiments, we verify the feasibility of the online VQA model by simplifying the model itself and reducing its input.
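The abstract's three-stage online pipeline (spatial feature extraction, temporal fusion, global quality regression) can be sketched as follows. This is a hedged toy sketch, not the paper's model: the per-frame statistics, the running-mean temporal fusion, and the linear regression head are all stand-in assumptions chosen so that quality can be re-estimated after every incoming frame without buffering the whole video, which is the property that makes the model "online".

```python
import numpy as np

def spatial_features(frame):
    # Hypothetical lightweight extractor: per-channel mean and std
    # of an (H, W, C) frame -> feature vector of length 2C.
    return np.concatenate([frame.mean(axis=(0, 1)), frame.std(axis=(0, 1))])

class OnlineVQA:
    """Toy online VQA pipeline: frame features are fused with a running
    average (temporal fusion), and a linear head (global regression)
    maps the fused feature to a quality score after every frame."""

    def __init__(self, w, b=0.0):
        self.w = np.asarray(w, dtype=float)  # regression weights (assumed given)
        self.b = b
        self.fused = None                    # running temporal average
        self.n = 0

    def update(self, frame):
        f = spatial_features(frame)
        self.n += 1
        # Incremental mean: no past frames need to be stored.
        self.fused = f if self.fused is None else self.fused + (f - self.fused) / self.n
        return float(self.fused @ self.w + self.b)  # current quality estimate

# Usage: feed frames one by one and read the score after each update.
model = OnlineVQA(w=np.ones(6))
score = model.update(np.ones((4, 4, 3)))
```

The incremental-mean update is the simplest fusion that keeps memory constant in the number of frames; the paper's temporal fusion module may of course be more sophisticated.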
Problem

Research questions and friction points this paper is trying to address.

Video Quality Assessment
Temporal-Spatial Factors
Information Preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal-Spatial Sampling
Video Quality Assessment (VQA)
Computational Efficiency
Jiebin Yan
School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, Jiangxi, China
Lei Wu
School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, Jiangxi, China
Yuming Fang
Jiangxi University of Finance and Economics
Image Processing, Video Processing, 3D Multimedia Processing
Xuelin Liu
School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, Jiangxi, China
Xue Xia
Pinterest
Weide Liu
Harvard University; Harvard Medical School
Machine Learning, Medical Image Analysis