Cultural Compass: A Framework for Organizing Societal Norms to Detect Violations in Human-AI Conversations

πŸ“… 2026-01-12
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This study addresses the frequent generation of unsafe or culturally inappropriate content by generative AI in cross-cultural interactions, stemming from insufficient modeling and evaluation of social norms. To bridge this gap, the work proposes the first multidimensional taxonomy that integrates normative context, domain specificity, and enforcement mechanisms, distinguishing between interpersonal norms and human–AI interaction norms. Building upon this framework, the authors develop a pipeline grounded in norm ontology modeling, natural language understanding, and automated evaluation to enable context-sensitive detection of norm violations in open-domain dialogue. Empirical analysis reveals that mainstream large language models commonly violate cultural norms, with violation rates significantly influenced by model architecture, country of origin, interaction context, and prompt intent.
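To make the taxonomy-driven pipeline concrete, here is a minimal sketch of how norms annotated along the paper's three dimensions (normative context, domain, enforcement mechanism) could drive context-sensitive violation detection. All class and function names are illustrative assumptions, not the authors' actual schema, and the classifier is a stand-in for the NLU/LLM judgment step.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical encoding of the paper's taxonomy dimensions.
class Context(Enum):
    HUMAN_HUMAN = "human-human"   # interpersonal norms the model should recognize
    HUMAN_AI = "human-AI"         # norms governing the human-AI interaction itself

class Enforcement(Enum):
    SOCIAL = "social"
    LEGAL = "legal"
    MORAL = "moral"

@dataclass
class Norm:
    description: str
    context: Context
    domain: str                   # domain specificity, e.g. "greetings", "dining"
    enforcement: Enforcement
    countries: set = field(default_factory=set)  # empty set = applies everywhere

@dataclass
class Violation:
    norm: Norm
    utterance: str

def detect_violations(utterance, country, norms, classifier):
    """Flag norms relevant to `country` that `classifier` judges violated.

    `classifier(utterance, norm) -> bool` stands in for the automated
    evaluation step (e.g. an LLM judge); any callable works here.
    """
    hits = []
    for norm in norms:
        if norm.countries and country not in norm.countries:
            continue  # norm not applicable in this cultural context
        if classifier(utterance, norm):
            hits.append(Violation(norm, utterance))
    return hits
```

Filtering by country before invoking the classifier reflects the context-sensitivity the summary describes: the same utterance may violate a norm in one locale and be unremarkable in another.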

πŸ“ Abstract
Generative AI models ought to be useful and safe across cross-cultural contexts. One critical step toward this goal is understanding how AI models adhere to sociocultural norms. While this challenge has gained attention in NLP, existing work lacks both nuance and coverage in understanding and evaluating models' norm adherence. We address these gaps by introducing a taxonomy of norms that clarifies their contexts (e.g., distinguishing between human-human norms that models should recognize and human-AI interactional norms that apply to the human-AI interaction itself), specifications (e.g., relevant domains), and mechanisms (e.g., modes of enforcement). We demonstrate how our taxonomy can be operationalized to automatically evaluate models' norm adherence in naturalistic, open-ended settings. Our exploratory analyses suggest that state-of-the-art models frequently violate norms, though violation rates vary by model, interactional context, and country. We further show that violation rates also vary by prompt intent and situational framing. Our taxonomy and demonstrative evaluation pipeline enable nuanced, context-sensitive evaluation of cultural norm adherence in realistic settings.
Problem

Research questions and friction points this paper is trying to address.

sociocultural norms
norm violation
cross-cultural AI
human-AI interaction
generative AI safety
Innovation

Methods, ideas, or system contributions that make the work stand out.

cultural norms
taxonomy
human-AI interaction
norm violation detection
cross-cultural evaluation
πŸ”Ž Similar Papers
No similar papers found.