🤖 AI Summary
Implicit hate speech (im-HS) poses significant challenges for automated detection due to its semantic opacity and strong contextual dependency. To address this, we propose the first six-dimensional codetype encoding taxonomy, systematically modeling both semantic and rhetorical characteristics of im-HS. Crucially, we integrate this encoding strategy deeply into large language model (LLM) prompt engineering and semantic embedding, marking the first such fusion of codetype guidance with LLM-based inference. We develop a cross-lingual, prompt-driven encoding detection framework grounded in this principle. Evaluated on benchmark Chinese and English im-HS datasets, our method achieves absolute F1-score improvements of 4.2–7.8 percentage points over state-of-the-art baselines. These results demonstrate the effectiveness and cross-lingual generalizability of codetype-guided LLMs for implicit hate speech identification.
📝 Abstract
The internet has become a hotspot for hate speech (HS), threatening societal harmony and individual well-being. While automatic detection methods perform well in identifying explicit hate speech (ex-HS), they struggle with more subtle forms, such as implicit hate speech (im-HS). We tackle this problem by introducing a new taxonomy for im-HS detection, which defines six encoding strategies, termed codetypes. We present two methods for integrating codetypes into im-HS detection: 1) prompting large language models (LLMs) directly to classify sentences based on their generated responses, and 2) using LLMs as encoders, with codetypes embedded during the encoding process. Experiments show that codetypes improve im-HS detection on both Chinese and English datasets, validating the effectiveness of our approach across languages.
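The first method described above can be sketched as follows. This is a minimal illustration only: the six codetype names and definitions below are hypothetical placeholders (the abstract does not list the actual taxonomy), and the call to a real LLM is left as a stub.

```python
# Sketch of method 1: prompt an LLM with codetype definitions and classify
# the sentence from its generated response.
# NOTE: the codetype names/definitions here are illustrative placeholders,
# not the paper's actual six-category taxonomy.

CODETYPES = {
    "irony": "states the opposite of the intended hostile meaning",
    "metaphor": "maps the target group onto a demeaning concept",
    "euphemism": "softened wording that masks a hostile claim",
    "coded_phrase": "wording recognizable as hostile only to in-groups",
    "stereotype": "attributes a fixed negative trait to a group",
    "circumlocution": "indirect description that implies the target",
}

LABELS = ("implicit_hate", "not_hate")

def build_prompt(sentence: str) -> str:
    """Assemble a classification prompt that enumerates the codetypes."""
    defs = "\n".join(f"- {name}: {desc}" for name, desc in CODETYPES.items())
    return (
        "Implicit hate speech often uses one of these encoding strategies "
        f"(codetypes):\n{defs}\n\n"
        f"Sentence: {sentence!r}\n"
        "Answer with exactly one label, 'implicit_hate' or 'not_hate', "
        "in the form 'Label: <label>'."
    )

def parse_label(response: str) -> str:
    """Extract the predicted label from the LLM's free-text response."""
    text = response.lower()
    for label in LABELS:
        if label in text:
            return label
    return "not_hate"  # conservative fallback when no label is found

# In a real pipeline, build_prompt(sentence) would be sent to an LLM API
# and parse_label applied to the returned text.
```

The second method (LLM as encoder) would instead embed the codetype definitions alongside the input sentence and train a lightweight classifier on the resulting representations; the details depend on the encoder architecture and are not specified in the abstract.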