Multi-Branch Collaborative Learning Network for Video Quality Assessment in Industrial Video Search

📅 2025-02-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses four kinds of low-quality video in industrial-scale video search: visual defects such as mosaics and black boxes, textual problems in video titles, textual problems in OCR content, and semantic problems such as inter-frame incoherence and frame-text mismatch, which are typical of AI-generated videos. We propose a multi-branch collaborative quality assessment network with four separate branches: visual, title, OCR, and semantic consistency. Each branch scores a video independently, and the scores are combined by a weighted aggregation with a squeeze-and-excitation mechanism that adjusts branch weights dynamically. Training uses a joint point-wise and pair-wise loss so that scores stay stable and well ordered. Deployed in a billion-scale video search system, the method improves detection accuracy for low-quality AI-generated videos and improves ranking quality over the baseline. Further analysis confirms that each of the four branches contributes positively.

📝 Abstract
Video Quality Assessment (VQA) is vital for large-scale video retrieval systems, where it identifies quality issues so that high-quality videos can be prioritized. In industrial systems, low-quality video characteristics fall into four categories: visual issues such as mosaics and black boxes, textual issues in video titles, textual issues in OCR content, and semantic issues such as frame incoherence and frame-text mismatch in AI-generated videos. Despite their prevalence in industrial settings, these low-quality videos have been largely overlooked in academic research, which makes them hard to identify accurately. To address this, we introduce the Multi-Branch Collaborative Network (MBCN), tailored for industrial video retrieval systems. MBCN features four branches, each designed to tackle one of the aforementioned quality issues. After each branch independently scores a video, we aggregate the scores using a weighted approach and a squeeze-and-excitation mechanism that dynamically addresses quality issues across different scenarios. We implement point-wise and pair-wise optimization objectives to ensure score stability and reasonableness. Extensive offline and online experiments on a world-level video search engine demonstrate MBCN's effectiveness in identifying video quality issues, significantly enhancing the retrieval system's ranking performance. Detailed experimental analyses confirm the positive contribution of all four evaluation branches. Furthermore, MBCN significantly improves recognition accuracy for low-quality AI-generated videos compared to the baseline.
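
The abstract names point-wise and pair-wise objectives but does not give their exact form. The sketch below is one plausible reading, not the paper's definition: it assumes mean squared error as the point-wise term, a logistic (RankNet-style) ranking loss as the pair-wise term, and a hypothetical mixing weight `alpha`. All function names are illustrative.

```python
import math

def pointwise_loss(scores, labels):
    """Assumed point-wise term: mean squared error against quality labels."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_loss(scores, labels):
    """Assumed pair-wise term: logistic ranking loss over pairs with different labels."""
    terms = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                # Small when video i (higher label) is scored above video j.
                terms.append(math.log1p(math.exp(-(scores[i] - scores[j]))))
    # With no comparable pairs there is nothing to rank, so the term is zero.
    return sum(terms) / len(terms) if terms else 0.0

def joint_loss(scores, labels, alpha=0.5):
    """Weighted sum of both terms; alpha is a hypothetical hyperparameter."""
    return alpha * pointwise_loss(scores, labels) + (1 - alpha) * pairwise_loss(scores, labels)

# Toy labels: 1.0 = high quality, 0.0 = low quality.
labels = [1.0, 1.0, 0.0, 0.0]
well_ordered = [0.9, 0.8, 0.2, 0.1]
inverted = [0.1, 0.2, 0.8, 0.9]
print(joint_loss(well_ordered, labels) < joint_loss(inverted, labels))
```

The point-wise term anchors each score to an absolute scale, while the pair-wise term only cares about relative order; combining them is a common way to get scores that are both calibrated and usable for ranking.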
Problem

Research questions and friction points this paper is trying to address.

Identify video quality issues
Enhance industrial video retrieval
Improve AI-generated video recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Branch Collaborative Network
Weighted score aggregation
Squeeze-and-excitation mechanism
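
To make the aggregation idea concrete, here is a minimal sketch of squeeze-and-excitation style gating applied to four branch scores. It is an illustration under stated assumptions rather than the paper's architecture: each branch is assumed to emit a single scalar score (so the usual "squeeze" pooling step is trivial), the bottleneck weights `w_down` and `w_up` are random placeholders instead of learned parameters, and normalizing the gates to sum to one is an added assumption.

```python
import math
import random

BRANCHES = ("visual", "title", "ocr", "semantic")

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_gate(scores, w_down, w_up):
    """Return one gate in (0, 1) per branch from a two-layer bottleneck."""
    # Reduce: project the branch scores to a smaller hidden layer with ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, scores))) for row in w_down]
    # Expand: project back to one value per branch and squash with a sigmoid.
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_up]

def aggregate(scores, w_down, w_up):
    """Combine branch scores using input-dependent, normalized gate weights."""
    gates = se_gate(scores, w_down, w_up)
    total = sum(gates)
    weights = [g / total for g in gates]  # assumption: weights sum to 1
    final = sum(w * s for w, s in zip(weights, scores))
    return final, weights

# Placeholder weights; in a trained model these would be learned.
rng = random.Random(0)
w_down = [[rng.uniform(-1, 1) for _ in BRANCHES] for _ in range(2)]
w_up = [[rng.uniform(-1, 1) for _ in range(2)] for _ in BRANCHES]

scores = [0.9, 0.8, 0.7, 0.2]  # hypothetical per-branch quality scores
final, weights = aggregate(scores, w_down, w_up)
print(dict(zip(BRANCHES, weights)), final)
```

Because the gates depend on the scores themselves, the weight given to each branch can change from video to video, which is the sense in which the weighting is dynamic.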
Authors

Hengzhu Tang, Baidu Inc. (Information Retrieval, Vision-Language Pre-training, VLLMs)
Zefeng Zhang, Institute of Information Engineering, Chinese Academy of Sciences (Natural Language Processing)
Zhiping Li, Baidu Inc., Beijing, China
Zhenyu Zhang, Baidu Inc., Beijing, China
Xing Wu, Baidu Inc., Beijing, China
Li Gao, Baidu Inc., Beijing, China
Suqi Cheng, Baidu Inc., Beijing, China
Dawei Yin, Senior Director, Head of Search Science at Baidu (Machine Learning, Web Mining, Data Mining)