🤖 AI Summary
This study addresses the chronic underrepresentation of Middle Eastern and North African (MENA) populations in AI evaluation. We introduce MENAValues, the first benchmark explicitly designed to assess the cultural alignment and multilingual biases of AI systems with respect to MENA beliefs and values. Methodologically, we construct a bilingual (English paired with Arabic, Persian, or Turkish), multi-perspective, multi-condition evaluation framework grounded in large-scale human values survey data. The framework systematically investigates how translation into native languages and reasoning prompts affect model responses, and probes internal logit distributions to expose preferences that surface behavior conceals. We empirically identify three novel forms of cultural misalignment, "cross-lingual value shift," "reasoning-induced degradation," and "logit leakage," which together reveal systemic biases such as the reduction of diverse MENA nations into monolithic stereotypes. The work delivers an extensible, culturally sensitive diagnostic framework and open-source tools, advancing global AI evaluation toward more culturally inclusive practice.
📝 Abstract
We introduce MENAValues, a novel benchmark designed to evaluate the cultural alignment and multilingual biases of large language models (LLMs) with respect to the beliefs and values of the Middle East and North Africa (MENA) region, an area underrepresented in current AI evaluation efforts. Drawing from large-scale, authoritative human surveys, we curate a structured dataset that captures the sociocultural landscape of MENA with population-level response distributions from 16 countries. To probe LLM behavior, we evaluate diverse models across six conditions formed by crossing three perspective framings (neutral, personalized, and third-person/cultural observer) with two language modes (English and the native languages Arabic, Persian, and Turkish). Our analysis reveals three critical phenomena: "Cross-Lingual Value Shifts," where identical questions yield drastically different responses depending on language; "Reasoning-Induced Degradation," where prompting models to explain their reasoning worsens cultural alignment; and "Logit Leakage," where models refuse sensitive questions while their internal probabilities reveal strong hidden preferences. We further demonstrate that models collapse into simplistic linguistic categories when operating in native languages, treating diverse nations as monolithic entities. MENAValues offers a scalable framework for diagnosing cultural misalignment, providing both empirical insights and methodological tools for developing more culturally inclusive AI.
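To make the evaluation design concrete, below is a minimal sketch of the 3 × 2 condition grid the abstract describes. The prompt templates, constant names, and wording are illustrative assumptions, not the benchmark's exact phrasing.

```python
from itertools import product

# Hypothetical sketch of the evaluation grid: three perspective framings
# (neutral, personalized, third-person/cultural observer) crossed with two
# language modes (English vs. a native language: Arabic, Persian, Turkish).
# Templates below are illustrative, not the paper's exact prompts.
FRAMINGS = {
    "neutral":      "How important is {topic} in life?",
    "personalized": "You are a person living in {country}. How important is {topic} in your life?",
    "observer":     "How important would people in {country} say {topic} is in their lives?",
}
LANGUAGE_MODES = ["english", "native"]  # native = Arabic, Persian, or Turkish

def build_conditions():
    """Enumerate the six (framing, language) evaluation conditions."""
    return list(product(FRAMINGS, LANGUAGE_MODES))

for framing, lang in build_conditions():
    print(f"{framing:>12} x {lang}")
```

Each survey item would then be rendered under all six conditions and the model's response distributions compared against the population-level survey data.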
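The "Logit Leakage" probe can likewise be sketched: rather than trusting the sampled text, which may be a refusal, one reads the next-token distribution over the answer options directly. This is a hedged illustration using the Hugging Face transformers API; the placeholder model name (gpt2), the question wording, and the single-token encoding of the options are assumptions, and the paper's exact protocol may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative "logit leakage" probe: even when a model's sampled text is
# a refusal, the next-token distribution over the answer options can still
# encode a clear preference. Model and prompt are placeholders.
model_name = "gpt2"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = (
    "On a scale of 1 (not at all) to 4 (very much), how much do you "
    "personally trust people of another religion? Answer: "
)
options = ["1", "2", "3", "4"]
# Assumes each option encodes to a single token, true for simple digits.
option_ids = [tok.encode(o, add_special_tokens=False)[0] for o in options]

with torch.no_grad():
    logits = model(**tok(prompt, return_tensors="pt")).logits[0, -1]

# Renormalize probability mass over just the answer options.
probs = torch.softmax(logits[option_ids], dim=-1)
for o, p in zip(options, probs.tolist()):
    print(f"option {o}: {p:.3f}")
```

Restricting the softmax to the option tokens is what makes the hidden preference visible: refusal tokens may dominate the full vocabulary distribution, yet the relative ordering among the answer options can remain sharply peaked.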