Learning Perceptual Representations for Gaming NR-VQA with Multi-Task FR Signals

📅 2026-02-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses no-reference video quality assessment (NR-VQA) for gaming videos, which is difficult due to the scarcity of human annotations and the content's distinctive characteristics, including rapid motion, stylized graphics, and compression artifacts. The authors propose MTL-VQA, a multi-task learning framework that uses full-reference (FR) quality metrics as self-supervised signals to pretrain perceptual features without human labels. By adaptively weighting and jointly optimizing multiple FR objectives, the model learns shared representations that transfer effectively to NR-VQA. Experiments on gaming video datasets show that MTL-VQA performs on par with state-of-the-art NR-VQA methods in MOS-supervised, label-efficient, and fully self-supervised settings.

📝 Abstract
No-reference video quality assessment (NR-VQA) for gaming videos is challenging due to limited human-rated datasets and unique content characteristics, including fast motion, stylized graphics, and compression artifacts. We present MTL-VQA, a multi-task learning framework that uses full-reference (FR) metrics as supervisory signals for pretraining, learning perceptually meaningful features without human labels. By jointly optimizing multiple FR objectives with adaptive task weighting, our approach learns shared representations that transfer effectively to NR-VQA. Experiments on gaming video datasets show that MTL-VQA achieves performance competitive with state-of-the-art NR-VQA methods in both MOS-supervised and label-efficient/self-supervised settings.
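The abstract's core idea, jointly regressing several FR metric scores with adaptive task weighting, can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the task names (`psnr`, `ssim`, `vmaf`), feature dimension, and the uncertainty-based weighting scheme (in the style of Kendall et al., 2018) are all assumptions, since the paper's exact weighting rule is not given here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveMultiTaskHead(nn.Module):
    """Regress several FR metric scores from a shared feature vector,
    weighting each task's loss by a learned per-task uncertainty.
    Task names and dimensions are illustrative, not from the paper."""

    def __init__(self, feat_dim: int, fr_tasks=("psnr", "ssim", "vmaf")):
        super().__init__()
        # one linear regression head per FR metric, on shared features
        self.heads = nn.ModuleDict({t: nn.Linear(feat_dim, 1) for t in fr_tasks})
        # learned log-variance per task; trained jointly with the backbone
        self.log_vars = nn.ParameterDict(
            {t: nn.Parameter(torch.zeros(())) for t in fr_tasks}
        )

    def forward(self, feats: torch.Tensor, targets: dict) -> torch.Tensor:
        total = feats.new_zeros(())
        for t, head in self.heads.items():
            mse = F.mse_loss(head(feats).squeeze(-1), targets[t])
            # adaptive weighting: exp(-log_var) scales the task loss down
            # when its uncertainty is high; +log_var regularizes the scale
            total = total + 0.5 * torch.exp(-self.log_vars[t]) * mse \
                          + 0.5 * self.log_vars[t]
        return total

# usage sketch: shared features from any backbone, FR scores as targets
model = AdaptiveMultiTaskHead(feat_dim=128)
feats = torch.randn(8, 128)
targets = {t: torch.randn(8) for t in ("psnr", "ssim", "vmaf")}
loss = model(feats, targets)
loss.backward()
```

Tasks the optimizer fits poorly accumulate larger learned variance and are automatically down-weighted, which is one common way to realize the "adaptive task weighting" the abstract describes.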
Problem

Research questions and friction points this paper is trying to address.

no-reference video quality assessment
gaming videos
perceptual representation
full-reference metrics
multi-task learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-task learning
no-reference VQA
full-reference supervision
perceptual representation
gaming video quality
Yu-Chih Chen
National Yang Ming Chiao Tung University; The University of Texas at Austin
Michael Wang
National Yang Ming Chiao Tung University
Chieh-Dun Wen
National Yang Ming Chiao Tung University
Kai-Siang Ma
National Yang Ming Chiao Tung University
Avinab Saha
Research Scientist, Google Research
Machine Learning · Visual Perception · Generative AI · RLHF · Self-Supervised Learning
Li-Heng Chen
Netflix Inc.
image and video quality · video coding · machine learning
Alan Bovik
Provost’s Endowed Chair Professor, University of Colorado Boulder
Computational Vision · Visual Neuroscience · Multimedia · Augmented Reality · Visual Perception