Dynamic Content Moderation in Livestreams: Combining Supervised Classification with MLLM-Boosted Similarity Matching

📅 2025-12-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of dynamic, multimodal, and hard-to-detect harmful content in live-streaming scenarios, this paper proposes a dual-channel hybrid moderation framework: (1) a supervised classification channel and (2) an MLLM-enhanced, reference-based similarity matching channel. Leveraging knowledge distillation from multimodal large language models (text, audio, and vision), the framework achieves lightweight inference while jointly supporting detection of known policy violations and discovery of novel, covert malicious behaviors. The classification channel achieves 67% recall at 80% precision; the similarity channel attains 76% recall at the same precision. Large-scale A/B testing demonstrates a 6–8% reduction in user exposure to harmful live streams. The core innovation is a dynamically coordinated mechanism integrating supervised learning with unsupervised similarity matching, significantly improving robustness, generalization, and scalability.
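The dual-channel coordination described above can be sketched as a simple decision rule: a lightweight supervised classifier handles known violation types, and a similarity lookup against a bank of reference embeddings of previously confirmed violations catches novel or covert cases. This is a minimal illustration, not the paper's implementation; the function names, thresholds, and the cosine-similarity choice are assumptions.

```python
import numpy as np

def cosine_sim(embedding, bank):
    """Cosine similarity between one embedding and a bank of reference embeddings."""
    e = embedding / np.linalg.norm(embedding)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    return b @ e

def moderate(embedding, clf_score, reference_bank,
             clf_threshold=0.8, sim_threshold=0.9):
    """Hypothetical dual-channel decision: flag if either the supervised
    classifier (channel 1) or similarity to known-violation references
    (channel 2) fires; otherwise pass."""
    if clf_score >= clf_threshold:              # channel 1: known violation types
        return "flag", "classifier"
    if cosine_sim(embedding, reference_bank).max() >= sim_threshold:
        return "flag", "similarity"             # channel 2: novel/covert cases
    return "pass", None
```

In a deployment like the one described, the thresholds would be tuned to the target operating point (here, 80% precision), and the reference bank would be refreshed as new violation patterns are confirmed.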

📝 Abstract
Content moderation remains a critical yet challenging task for large-scale user-generated video platforms, especially in livestreaming environments where moderation must be timely, multimodal, and robust to evolving forms of unwanted content. We present a hybrid moderation framework deployed at production scale that combines supervised classification for known violations with reference-based similarity matching for novel or subtle cases. This hybrid design enables robust detection of both explicit violations and novel edge cases that evade traditional classifiers. Multimodal inputs (text, audio, visual) are processed through both pipelines, with a multimodal large language model (MLLM) distilling knowledge into each to boost accuracy while keeping inference lightweight. In production, the classification pipeline achieves 67% recall at 80% precision, and the similarity pipeline achieves 76% recall at 80% precision. Large-scale A/B tests show a 6-8% reduction in user views of unwanted livestreams. These results demonstrate a scalable and adaptable approach to multimodal content governance, capable of addressing both explicit violations and emerging adversarial behaviors.
Problem

Research questions and friction points this paper is trying to address.

Detecting both explicit violations and novel edge cases in livestream content moderation
Processing multimodal inputs (text, audio, visual) for robust moderation
Keeping pace with evolving unwanted content in real-time livestreaming environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework combining supervised classification and similarity matching
Multimodal large language model boosting accuracy and keeping inference lightweight
Scalable approach reducing unwanted livestream views by 6-8%
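The "lightweight inference" innovation rests on knowledge distillation: a large MLLM teacher produces soft labels that train a small student deployed in production. The paper does not specify its loss, so the sketch below uses the standard temperature-softened KL-divergence distillation objective as an illustration; the temperature value and function names are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()
    p = np.exp(z)
    return p / p.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients stay comparable across temperatures
    (the classic soft-label distillation recipe)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return temperature ** 2 * np.sum(p_t * (np.log(p_t) - np.log(p_s)))
```

In practice this term is typically mixed with a hard-label cross-entropy loss, letting the compact student approximate the MLLM's multimodal judgments at a fraction of the serving cost.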
Wei Chee Yew
TikTok, Singapore, Singapore
Hailun Xu
TikTok, Singapore, Singapore
Sanjay Saha
PhD in Computer Science, National University of Singapore
Computer Vision · Biometrics · Machine Learning
Xiaotian Fan
TikTok, Singapore, Singapore
Hiok Hian Ong
TikTok, Singapore, Singapore
David Yuchen Wang
TikTok, Singapore, Singapore
Kanchan Sarkar
TikTok, Singapore, Singapore
Zhenheng Yang
TikTok
Computer Vision · Machine Learning · Deep Learning
Danhui Guan
TikTok, Singapore, Singapore