🤖 AI Summary
This study addresses the limited cross-cultural fairness and robustness of current vision-language models in detecting hateful memes, stemming from training data that predominantly reflects Western contexts. The authors develop a systematic evaluation framework to analyze the performance of mainstream models on multilingual memes across three dimensions: learning strategies, prompt language, and translation effects, uncovering significant performance degradation when detection follows machine translation. To mitigate the models' systematic bias toward Western safety norms, they propose a culturally aligned intervention combining native-language prompting with one-shot learning. Experimental results demonstrate that this approach substantially improves cross-cultural detection accuracy, offering a practical pathway toward globally robust multimodal content moderation systems.
📝 Abstract
Cultural context profoundly shapes how people interpret online content, yet vision-language models (VLMs) remain predominantly trained through Western or English-centric lenses. This limits their fairness and cross-cultural robustness in tasks like hateful meme detection. We introduce a systematic evaluation framework designed to diagnose and quantify the cross-cultural robustness of state-of-the-art VLMs across multilingual meme datasets, analyzing three axes: (i) learning strategy (zero-shot vs. one-shot), (ii) prompting language (native vs. English), and (iii) translation effects on meaning and detection. Results show that the common ``translate-then-detect'' approach degrades performance, while culturally aligned interventions (native-language prompting and one-shot learning) significantly enhance detection. Our findings reveal a systematic convergence toward Western safety norms and provide actionable strategies to mitigate this bias, guiding the design of globally robust multimodal moderation systems.
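For concreteness, the sketch below enumerates the eight experimental conditions implied by the three evaluation axes and assembles a prompt for the culturally aligned intervention (native-language prompting plus one-shot learning). This is a minimal illustration under assumed names, not the authors' implementation: `build_prompt`, the axis labels, and the placeholder strings are all hypothetical.

```python
# Minimal sketch (not the paper's code) of the evaluation grid implied by
# the three axes above, plus prompt assembly for the culturally aligned
# intervention. All names here (build_prompt, axis labels, placeholders)
# are illustrative assumptions.
from itertools import product
from typing import Optional

LEARNING = ("zero-shot", "one-shot")              # axis (i): learning strategy
PROMPT_LANG = ("native", "english")               # axis (ii): prompting language
TRANSLATION = ("none", "translate-then-detect")   # axis (iii): translation effect


def build_prompt(meme_text: str, lang: str, shot: str,
                 example: Optional[str] = None) -> str:
    """Assemble a hateful-meme detection prompt for one grid condition."""
    # In the native condition the instruction would be written in the
    # meme's own language; an English placeholder is shown for brevity.
    instruction = "Is this meme hateful? Answer yes or no."
    parts = []
    if shot == "one-shot" and example is not None:
        parts.append(f"Labelled example:\n{example}")
    parts.append(f"Meme text: {meme_text}\n{instruction}")
    return "\n\n".join(parts)


# Enumerate the 2 x 2 x 2 = 8 conditions of the evaluation grid.
for shot, lang, trans in product(LEARNING, PROMPT_LANG, TRANSLATION):
    print(f"learning={shot}, prompt_lang={lang}, translation={trans}")

# Culturally aligned intervention: native-language prompt with one example.
print(build_prompt("<meme caption here>", lang="native", shot="one-shot",
                   example="<native-language labelled meme>"))
```

Under this framing, the ``translate-then-detect'' baseline corresponds to the `translation="translate-then-detect"` conditions, where the meme text is machine-translated into English before being passed to the model.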