🤖 AI Summary
This work addresses four categories of low-quality video in industrial-scale video search: visual artifacts, textual anomalies in titles, textual anomalies in OCR content, and the semantic problems typical of AI-generated videos (inter-frame incoherence and frame-text mismatch). We propose a multi-branch collaborative quality assessment network with a four-branch disentangled architecture: visual, title, OCR, and semantic consistency branches. To integrate the branch-specific scores, we introduce a Squeeze-and-Excitation-based dynamic weighting mechanism, and we train with a joint point-wise and pair-wise loss so that scores are both stable and discriminative. Deployed in a billion-scale video search system, our method significantly improves detection accuracy for low-quality AI-generated videos and improves ranking quality over the baseline. Ablation studies confirm the contribution of each branch. The result is a scalable, interpretable, and fine-grained quality-aware solution for large-scale video retrieval.
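As a rough illustration of the Squeeze-and-Excitation-based weighting described above, the following sketch gates four branch scores with a small bottleneck network and fuses them. It is not the authors' implementation: the function names, the reduction ratio, the random weights, and the normalized weighted average are all assumptions made for this example.

```python
import math
import random

# Branch order is an assumption for this example.
BRANCHES = ("visual", "title", "ocr", "semantic")


def _matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def init_weights(n=4, reduction=2, seed=0):
    """Random bottleneck weights; a trained model would learn these."""
    rng = random.Random(seed)
    hidden = max(1, n // reduction)
    w1 = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(n)]
    return w1, w2


def se_gate(scores, w1, w2):
    """SE-style excitation: FC -> ReLU -> FC -> sigmoid, one gate per branch."""
    hidden = [max(0.0, h) for h in _matvec(w1, scores)]
    return [1.0 / (1.0 + math.exp(-z)) for z in _matvec(w2, hidden)]


def fuse(scores, w1, w2):
    """Gate-weighted average of branch scores (normalization is an assumption)."""
    gates = se_gate(scores, w1, w2)
    return sum(g * s for g, s in zip(gates, scores)) / sum(gates)


if __name__ == "__main__":
    w1, w2 = init_weights()
    branch_scores = [0.9, 0.2, 0.6, 0.4]  # hypothetical per-branch quality scores
    for name, gate in zip(BRANCHES, se_gate(branch_scores, w1, w2)):
        print(f"{name}: gate={gate:.3f}")
    print(f"fused score: {fuse(branch_scores, w1, w2):.3f}")
```

Because the gates are computed from the scores themselves, the weight given to each branch can change from one video to the next, which is the sense in which the weighting is dynamic.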
📝 Abstract
Video Quality Assessment (VQA) is vital for large-scale video retrieval systems: it identifies quality issues so that high-quality videos can be prioritized. In industrial systems, low-quality video characteristics fall into four categories: visual issues such as mosaics and black boxes, textual issues in video titles, textual issues in OCR content, and semantic issues such as frame incoherence and frame-text mismatch in AI-generated videos. Despite their prevalence in industrial settings, these low-quality videos have been largely overlooked in academic research, which makes them hard to identify accurately. To address this, we introduce the Multi-Branch Collaborative Network (MBCN), tailored for industrial video retrieval systems. MBCN features four branches, each designed to tackle one of the aforementioned quality issues. After each branch independently scores a video, we aggregate the scores using a weighted approach and a squeeze-and-excitation mechanism, so that the emphasis on each quality issue adapts dynamically to the scenario. We train with point-wise and pair-wise optimization objectives to keep the scores stable and reasonable. Extensive offline and online experiments on a world-scale video search engine demonstrate MBCN's effectiveness in identifying video quality issues and show that it significantly enhances the retrieval system's ranking performance. Detailed experimental analyses confirm the positive contribution of all four evaluation branches. Furthermore, MBCN significantly improves recognition accuracy for low-quality AI-generated videos compared to the baseline.
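To make the combination of point-wise and pair-wise objectives concrete, here is a minimal sketch. The abstract does not specify the exact losses, so the choices below are assumptions: mean squared error for the point-wise term, a RankNet-style logistic loss for the pair-wise term, and a hypothetical mixing weight `alpha`.

```python
import math


def pointwise_loss(scores, labels):
    """Mean squared error between predicted scores and quality labels."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores, labels):
    """Logistic ranking loss over every pair whose labels differ.

    For each pair (i, j) with labels[i] > labels[j], the loss is small
    when scores[i] exceeds scores[j] and large when the order is flipped.
    """
    terms = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                diff = scores[i] - scores[j]
                terms.append(math.log1p(math.exp(-diff)))
    return sum(terms) / len(terms) if terms else 0.0


def joint_loss(scores, labels, alpha=0.5):
    """Weighted sum of both terms; alpha is a hypothetical hyperparameter."""
    return (alpha * pointwise_loss(scores, labels)
            + (1.0 - alpha) * pairwise_loss(scores, labels))


if __name__ == "__main__":
    labels = [1.0, 0.5, 0.0]           # hypothetical graded quality labels
    well_ordered = [0.9, 0.5, 0.1]     # predictions that respect the ordering
    mis_ordered = [0.1, 0.5, 0.9]      # predictions that invert it
    print("well ordered:", joint_loss(well_ordered, labels))
    print("mis-ordered: ", joint_loss(mis_ordered, labels))
```

In this sketch the point-wise term anchors each score to an absolute label, which relates to score stability, while the pair-wise term penalizes inverted orderings between videos of different quality, which relates to how reasonable the resulting ranking is.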