SynHate: Detecting Hate Speech in Synthetic Deepfake Audio

📅 2025-06-07
🤖 AI Summary
This study addresses multilingual hate speech detection in deepfake audio. We introduce SynHate, the first multilingual dataset for detecting hate speech in synthetic audio, covering 37 languages, and propose a four-class annotation scheme (Real-normal, Real-hate, Fake-normal, Fake-hate) that defines the detection task for synthetic speech. The dataset is built from the MuTox and ADIMA datasets. We evaluate five leading self-supervised speech models (Whisper-small, Whisper-medium, XLS-R, AST, and mHuBERT) across languages and across datasets. Experiments reveal notable performance differences by language: Whisper-small performs best overall, but cross-dataset generalization remains a challenge. To support reproducibility and further research, we publicly release the SynHate dataset and baseline code.

📝 Abstract
The rise of deepfake audio and hate speech, powered by advanced text-to-speech, threatens online safety. We present SynHate, the first multilingual dataset for detecting hate speech in synthetic audio, spanning 37 languages. SynHate uses a novel four-class scheme: Real-normal, Real-hate, Fake-normal, and Fake-hate. Built from MuTox and ADIMA datasets, it captures diverse hate speech patterns globally and in India. We evaluate five leading self-supervised models (Whisper-small/medium, XLS-R, AST, mHuBERT), finding notable performance differences by language, with Whisper-small performing best overall. Cross-dataset generalization remains a challenge. By releasing SynHate and baseline code, we aim to advance robust, culturally sensitive, and multilingual solutions against synthetic hate speech. The dataset is available at https://www.iab-rubric.org/resources.

Problem

Research questions and friction points this paper is trying to address.

Detecting hate speech in synthetic deepfake audio
Creating a multilingual dataset for synthetic hate speech detection
Evaluating model performance across diverse languages and hate speech patterns
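
As a rough illustration of what evaluating performance across languages can look like, the sketch below groups predictions by language and reports a score per language. It is a minimal sketch, not the paper's evaluation code: the function name `accuracy_by_language`, the record layout, and the choice of accuracy as the metric are all assumptions made here for illustration.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Compute accuracy separately for each language.

    `records` is an iterable of (language, true_label, predicted_label)
    tuples. This layout is hypothetical, not the SynHate file format.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, true_label, predicted_label in records:
        total[language] += 1
        if true_label == predicted_label:
            correct[language] += 1
    # One score per language makes per-language gaps visible,
    # which a single pooled score would hide.
    return {language: correct[language] / total[language] for language in total}


# Toy predictions, invented purely for this example.
toy_records = [
    ("hi", "Fake-hate", "Fake-hate"),
    ("hi", "Real-normal", "Fake-normal"),
    ("en", "Real-hate", "Real-hate"),
]
scores = accuracy_by_language(toy_records)
```

Reporting one score per language, rather than a single pooled number, is what makes differences between languages visible.
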

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual dataset for synthetic hate speech detection
Novel four-class scheme for audio classification
Evaluation of five self-supervised models across 37 languages
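
The four-class scheme can be read as the combination of two binary attributes: whether the audio is real or synthetic, and whether the speech is normal or hateful. The sketch below shows that reading; the enum `SynHateLabel` and the helper names are hypothetical and chosen here for illustration, not taken from the released baseline code.

```python
from enum import Enum


class SynHateLabel(Enum):
    """The four classes named in the abstract."""
    REAL_NORMAL = "Real-normal"
    REAL_HATE = "Real-hate"
    FAKE_NORMAL = "Fake-normal"
    FAKE_HATE = "Fake-hate"


def to_four_class(is_synthetic: bool, is_hate: bool) -> SynHateLabel:
    """Combine two binary attributes into one four-class label (assumed mapping)."""
    if is_synthetic:
        return SynHateLabel.FAKE_HATE if is_hate else SynHateLabel.FAKE_NORMAL
    return SynHateLabel.REAL_HATE if is_hate else SynHateLabel.REAL_NORMAL


def from_four_class(label: SynHateLabel) -> tuple[bool, bool]:
    """Recover (is_synthetic, is_hate) from a four-class label."""
    is_synthetic = label in (SynHateLabel.FAKE_NORMAL, SynHateLabel.FAKE_HATE)
    is_hate = label in (SynHateLabel.REAL_HATE, SynHateLabel.FAKE_HATE)
    return is_synthetic, is_hate
```

Keeping a single four-way label lets one classifier answer both questions at once, while the reverse mapping still allows the deepfake and hate speech decisions to be scored separately.
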

Rishabh Ranjan
Indian Institute of Technology Jodhpur, India
Kishan Pipariya
Pandit Deendayal Energy University, India
M. Vatsa
Indian Institute of Technology Jodhpur, India
Richa Singh
Professor, IIT Jodhpur
Biometrics, Pattern Recognition, Machine Learning, Face Recognition, Deep Learning