Metamorphic Testing for Audio Content Moderation Software

📅 2025-09-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Audio content moderation systems are vulnerable to subtle adversarial perturbations, such as pitch shifting and noise injection, which enable evasion of detection for harmful content (e.g., hate speech, fraudulent advertisements). To address this, we introduce MTAM, the first metamorphic testing framework tailored for audio content moderation. MTAM defines 14 semantics-preserving metamorphic relations across two perturbation categories (audio-features-based and heuristic) and applies them to toxic audio to generate test cases that retain their toxicity while becoming far more likely to evade detection; high-quality test cases are curated via a pilot study. Evaluated on five commercial systems and one academic model, MTAM achieves error finding rates of 16.7%–51.1% across the commercial tools and up to 45.7% on the academic model, exposing critical robustness deficiencies. This work establishes a systematic paradigm for evaluating audio content safety and advances rigorous, semantics-aware testing in audio AI security.

📝 Abstract
The rapid growth of audio-centric platforms and applications such as WhatsApp and Twitter has transformed the way people communicate and share audio content in modern society. However, these platforms are increasingly misused to disseminate harmful audio content, such as hate speech, deceptive advertisements, and explicit material, which can have significant negative consequences (e.g., detrimental effects on mental health). In response, researchers and practitioners have been actively developing and deploying audio content moderation tools to tackle this issue. Despite these efforts, malicious actors can bypass moderation systems by making subtle alterations to audio content, such as modifying pitch or inserting noise. Moreover, the effectiveness of modern audio moderation tools against such adversarial inputs remains insufficiently studied. To address these challenges, we propose MTAM, a Metamorphic Testing framework for Audio content Moderation software. Specifically, we conduct a pilot study on 2000 audio clips and define 14 metamorphic relations across two perturbation categories: Audio Features-Based and Heuristic perturbations. MTAM applies these metamorphic relations to toxic audio content to generate test cases that remain harmful while being more likely to evade detection. In our evaluation, we employ MTAM to test five commercial audio content moderation tools and an academic model against three kinds of toxic content. The results show that MTAM achieves up to 38.6%, 18.3%, 35.1%, 16.7%, and 51.1% error finding rates (EFR) when testing commercial moderation software provided by Gladia, Assembly AI, Baidu, Nextdata, and Tencent, respectively, and it obtains up to 45.7% EFR when testing the state-of-the-art algorithm from academia.
Problem

Research questions and friction points this paper is trying to address.

Detecting harmful audio content that evades moderation systems
Evaluating the robustness of audio moderation tools against adversarial perturbations
Testing commercial and academic models against adversarial audio inputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Metamorphic testing framework tailored for audio content moderation
Defines 14 metamorphic relations across two perturbation categories
Generates test cases that remain harmful while evading detection
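The core idea above can be sketched as a minimal metamorphic test: apply a semantics-preserving perturbation (here, low-amplitude noise injection, in the spirit of the paper's Heuristic perturbation category) and check that a "toxic" verdict survives it. This is an illustrative sketch only; `fake_moderate` is a hypothetical stand-in for a real moderation API, not the paper's implementation.

```python
import numpy as np

def inject_noise(audio: np.ndarray, snr_db: float = 20.0) -> np.ndarray:
    """Add white noise at a target SNR; a low-amplitude perturbation
    that keeps the speech content (and thus its toxicity) intact."""
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.normal(0.0, np.sqrt(noise_power), audio.shape)
    return audio + noise

def metamorphic_test(moderate, audio: np.ndarray) -> bool:
    """Metamorphic relation: a semantics-preserving perturbation must not
    flip a 'toxic' verdict to 'safe'. Returns True if the relation is
    violated, i.e., a robustness bug was found."""
    original = moderate(audio)
    perturbed = moderate(inject_noise(audio))
    return original == "toxic" and perturbed != "toxic"

# Toy moderation stub (hypothetical placeholder for a commercial API):
# flags any clip whose peak amplitude exceeds a threshold.
fake_moderate = lambda a: "toxic" if np.max(np.abs(a)) > 0.5 else "safe"

# 1-second 440 Hz tone at 16 kHz as a dummy "audio clip".
clip = 0.8 * np.sin(2 * np.pi * 440 * np.linspace(0, 1, 16000))
violated = metamorphic_test(fake_moderate, clip)
```

A real harness would replace `fake_moderate` with calls to the moderation services under test and iterate `metamorphic_test` over a corpus of toxic clips and all 14 relations, counting violations to compute the error finding rate.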
Wenxuan Wang
School of Information, Renmin University of China, China
Yongjiang Wu
Undergraduate of Computer Science, The Chinese University of Hong Kong
Trustworthy AI · LLM safety · Deep Learning
Junyuan Zhang
Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
Shuqing Li
The Chinese University of Hong Kong
Reliable Spatial Intelligence · Multimodal LLM Agents · XR (VR/AR/MR) Systems · XR Security
Yun Peng
Department of Computer Science and Engineering, The Chinese University of Hong Kong, China
Wenting Chen
Department of Electrical Engineering, City University of Hong Kong, China
Shuai Wang
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, China
Michael R. Lyu
Professor of Computer Science & Engineering, The Chinese University of Hong Kong
software engineering · software reliability · fault tolerance · machine learning · distributed systems