🤖 AI Summary
To address the challenges of dynamic, multimodal, and hard-to-detect harmful content in live-streaming scenarios, this paper proposes a dual-channel hybrid moderation framework: (1) a supervised classification channel and (2) an MLLM-enhanced reference-based similarity matching channel. Leveraging knowledge distillation from multimodal large language models (text, audio, and vision), the framework achieves lightweight inference while jointly supporting both detection of known policy violations and discovery of novel, covert malicious behaviors. The classification channel achieves 67% recall at 80% precision; the similarity channel attains 76% recall at the same precision. Large-scale A/B testing demonstrates a 6–8% reduction in user exposure to harmful live streams. The core innovation lies in a dynamically coordinated mechanism integrating supervised learning with unsupervised similarity matching, significantly improving robustness, generalization, and scalability.
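The dual-channel coordination described above can be illustrated with a minimal sketch: a supervised classifier score handles known violation types, while cosine similarity against a curated reference set of confirmed harmful streams catches novel variants. The function names, thresholds, and reference set here are illustrative assumptions, not details from the paper.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def moderate(classifier_score: float,
             embedding: np.ndarray,
             reference_embeddings: list,
             cls_threshold: float = 0.8,    # hypothetical operating point
             sim_threshold: float = 0.9) -> str:
    # Channel 1: supervised classifier flags known policy violations.
    if classifier_score >= cls_threshold:
        return "flag:known_violation"
    # Channel 2: reference-based similarity matching flags streams that
    # closely resemble previously confirmed harmful content, even when
    # the classifier misses them (novel or covert behaviors).
    if reference_embeddings:
        best = max(cosine_similarity(embedding, r) for r in reference_embeddings)
        if best >= sim_threshold:
            return "flag:reference_match"
    return "pass"
```

In the deployed system, both channels consume MLLM-distilled multimodal embeddings, so this per-stream decision stays lightweight at inference time.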
📝 Abstract
Content moderation remains a critical yet challenging task for large-scale user-generated video platforms, especially in livestreaming environments where moderation must be timely, multimodal, and robust to evolving forms of unwanted content. We present a hybrid moderation framework deployed at production scale that combines supervised classification for known violations with reference-based similarity matching for novel or subtle cases. This hybrid design enables robust detection of both explicit violations and novel edge cases that evade traditional classifiers. Multimodal inputs (text, audio, visual) are processed through both pipelines, with a multimodal large language model (MLLM) distilling knowledge into each to boost accuracy while keeping inference lightweight. In production, the classification pipeline achieves 67% recall at 80% precision, and the similarity pipeline achieves 76% recall at 80% precision. Large-scale A/B tests show a 6–8% reduction in user views of unwanted livestreams. These results demonstrate a scalable and adaptable approach to multimodal content governance, capable of addressing both explicit violations and emerging adversarial behaviors.