🤖 AI Summary
This study addresses the risk of cultural offense arising from AI systems’ misinterpretation of cross-cultural nonverbal gestures. We introduce MC-SIGNS, the first cross-cultural benchmark for evaluating gesture offensiveness, covering 25 gestures across 85 countries (288 gesture–country pairs). Leveraging multi-source human annotation, cross-cultural semantic alignment, and contextualized robustness testing, we systematically assess the cultural sensitivity of text-to-image (T2I) models, large language models (LLMs), and vision-language models (VLMs). Our analysis reveals a pronounced US-centric bias: T2I models suffer a 37% accuracy drop in non-US contexts, LLMs demonstrate excessive sensitivity (62% false positives), and VLMs generate US-centric gestures for 91% of generic prompts. We propose a cultural-context-aware evaluation paradigm for nonverbal safety, extending AI fairness research beyond linguistic dimensions to embodied expression. This work establishes a foundational assessment framework for globally deployable, culturally responsible AI systems.
📝 Abstract
Gestures are an integral part of non-verbal communication; their meanings vary across cultures, and misinterpretations can have serious social and diplomatic consequences. As AI systems become more integrated into global applications, ensuring they do not inadvertently perpetuate cultural offenses is critical. To this end, we introduce the Multi-Cultural Set of Inappropriate Gestures and Nonverbal Signs (MC-SIGNS), a dataset of 288 gesture-country pairs annotated for offensiveness, cultural significance, and contextual factors across 25 gestures and 85 countries. Through systematic evaluation using MC-SIGNS, we uncover critical limitations: text-to-image (T2I) systems exhibit strong US-centric biases, detecting offensive gestures more reliably in US contexts than in non-US ones; large language models (LLMs) tend to over-flag gestures as offensive; and vision-language models (VLMs) default to US-based interpretations when responding to universal concepts like wishing someone luck, frequently suggesting culturally inappropriate gestures. These findings highlight the urgent need for culturally aware AI safety mechanisms to ensure equitable global deployment of AI technologies.