Multilingual Hate Speech Detection in Social Media Using Translation-Based Approaches with Large Language Models

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of hate speech detection resources for the low-resource language Urdu. We introduce the first balanced English–Urdu–Spanish tweet dataset (10,193 instances). To tackle cross-lingual transfer, we propose a translation-augmented, attention-guided large language model (LLM) framework: (i) an attention layer is inserted before the LLM to pre-extract multilingual features, and (ii) a novel LLM-based translation-aligned joint training scheme is designed. Experiments show that the joint multilingual model achieves a macro-F1 of 0.88, outperforming an SVM baseline by 7.32%. Urdu-specific performance reaches 0.81 (+5.19%), while English and Spanish attain 0.87 and 0.85, respectively. This work fills a critical gap in multilingual hate speech research by establishing the first dedicated Urdu benchmark and methodology, and advances cross-lingual modeling for low-resource languages through a principled, attention-enhanced, translation-aware LLM paradigm.
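The attention-based feature pre-extraction described in (i) can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual architecture: the sequence length, embedding dimension, and mean-pooling step are all stand-ins for whatever the authors use.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_features(token_embeddings):
    """Scaled dot-product self-attention over token embeddings,
    mean-pooled into a single sentence-level feature vector
    (hypothetical pre-extraction step placed before the LLM)."""
    X = np.asarray(token_embeddings, dtype=float)      # (seq_len, dim)
    d = X.shape[-1]
    weights = softmax(X @ X.T / np.sqrt(d), axis=-1)   # (seq_len, seq_len)
    attended = weights @ X                             # (seq_len, dim)
    return attended.mean(axis=0)                       # (dim,)

# Toy input: 12 tokens with 64-dimensional embeddings.
rng = np.random.default_rng(0)
features = attention_features(rng.normal(size=(12, 64)))
```

The pooled vector would then serve as an auxiliary multilingual feature for the downstream model; the exact fusion mechanism is not specified here.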

📝 Abstract
Social media platforms are critical spaces for public discourse, shaping opinions and community dynamics, yet their widespread use has amplified harmful content, particularly hate speech, threatening online safety and inclusivity. While hate speech detection has been extensively studied in languages like English and Spanish, Urdu remains underexplored, especially using translation-based approaches. To address this gap, we introduce a trilingual dataset of 10,193 tweets in English (3,834 samples), Urdu (3,197 samples), and Spanish (3,162 samples), collected via keyword filtering, with a balanced distribution of 4,849 Hateful and 5,344 Not-Hateful labels. Our methodology leverages attention layers as a precursor to transformer-based models and large language models (LLMs), enhancing feature extraction for multilingual hate speech detection. For non-transformer models, we use TF-IDF for feature extraction. The dataset is benchmarked using state-of-the-art models, including GPT-3.5 Turbo and Qwen 2.5 72B, alongside traditional machine learning models like SVM and other transformers (e.g., BERT, RoBERTa). Three annotators, following rigorous guidelines, ensured high dataset quality, achieving a Fleiss' Kappa of 0.821. Our approach, integrating attention layers with GPT-3.5 Turbo and Qwen 2.5 72B, achieves strong performance, with macro F1 scores of 0.87 for English (GPT-3.5 Turbo), 0.85 for Spanish (GPT-3.5 Turbo), 0.81 for Urdu (Qwen 2.5 72B), and 0.88 for the joint multilingual model (Qwen 2.5 72B). These results reflect improvements of 8.75% in English (over SVM baseline 0.80), 8.97% in Spanish (over SVM baseline 0.78), 5.19% in Urdu (over SVM baseline 0.77), and 7.32% in the joint multilingual model (over SVM baseline 0.82). Our framework offers a robust solution for multilingual hate speech detection, fostering safer digital communities worldwide.
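The reported inter-annotator agreement (Fleiss' Kappa of 0.821 across three annotators) follows the standard formula: per-item observed agreement averaged over items, corrected for chance agreement from the marginal category proportions. A minimal pure-Python sketch, using an illustrative toy rating matrix rather than the paper's data:

```python
def fleiss_kappa(ratings):
    """Fleiss' Kappa for a table where ratings[i][j] counts how many
    annotators assigned item i to category j (equal raters per item)."""
    N = len(ratings)            # number of items
    n = sum(ratings[0])         # annotators per item
    k = len(ratings[0])         # number of categories
    # Mean per-item observed agreement P_bar.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in ratings
    ) / N
    # Chance agreement P_e from marginal category proportions.
    p = [sum(row[j] for row in ratings) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)

# Toy example: 3 annotators, 2 items, 2 labels (Hateful / Not-Hateful).
kappa = fleiss_kappa([[3, 0], [2, 1]])   # -0.2: below-chance agreement
```

A value of 0.821, as reported for this dataset, falls in the "almost perfect agreement" band of the commonly used Landis and Koch interpretation scale.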
Problem

Research questions and friction points this paper is trying to address.

Detect hate speech in Urdu using translation-based approaches.
Improve multilingual hate speech detection with large language models.
Address data imbalance in trilingual hate speech datasets.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses translation-based multilingual hate speech detection
Leverages attention layers with transformer models
Benchmarks GPT-3.5 Turbo and Qwen 2.5 72B
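Macro-F1, the metric behind all of the comparisons above, is the unweighted mean of per-class F1 scores, so the Hateful and Not-Hateful classes count equally regardless of their frequencies. A minimal sketch of the computation:

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes in y_true."""
    classes = sorted(set(y_true))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy binary example (1 = Hateful, 0 = Not-Hateful):
# class 1 has F1 = 0.667, class 0 has F1 = 0.8, macro-F1 ≈ 0.733.
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

This matches the behavior of scikit-learn's `f1_score(..., average="macro")` on the classes present in the gold labels.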
Authors

Muhammad Usman
Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Nueva Industrial Vallejo, Gustavo A. Madero, Mexico City 07700, Mexico

Muhammad Ahmad
King Fahd University of Petroleum and Minerals
Machine Learning, Computer Vision, Hyperspectral Imaging

Shahiki Tash
Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Nueva Industrial Vallejo, Gustavo A. Madero, Mexico City 07700, Mexico

Irina Gelbukh
CIC, IPN
Reeb Graphs, Foliations, Topology of Manifolds

Rolando Quintero Tellez
Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Nueva Industrial Vallejo, Gustavo A. Madero, Mexico City 07700, Mexico

Grigori Sidorov
Professor of Computational Linguistics, Instituto Politécnico Nacional (IPN), Mexico
Computational Linguistics, Natural Language Processing, Artificial Intelligence, Machine Learning