🤖 AI Summary
This study addresses the limitations of mainstream large language models in aligning with public values across Asia's multireligious contexts, a challenge rooted in their reliance on English-centric training data. Such models often overlook minority perspectives and reinforce stereotypes, particularly on religious matters. To audit cultural alignment systematically, this work adopts religion as an analytical lens and conducts a multilingual evaluation spanning India, East Asia, and Southeast Asia. Leveraging multilingual prompts, demographic steering, logits distribution analysis, and region-specific bias benchmarks (including CrowS-Pairs and IndiBias), the study assesses how closely models' internal representations match actual public attitudes across these regions. Findings reveal persistent systemic biases in religious discourse; while lightweight interventions offer partial mitigation, models still exhibit significant gaps in representativeness and fairness in sensitive contexts.
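A minimal sketch of the logits-distribution analysis described above, assuming an open-weights causal LM and a two-option survey item. The model name, question, ground-truth shares, and the choice of Jensen-Shannon distance as the alignment metric are illustrative assumptions, not the paper's actual setup.

```python
# Sketch: read the model's next-token logits over answer options and
# compare the induced "opinion distribution" to survey ground truth.
# Model, question, and survey shares below are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from scipy.spatial.distance import jensenshannon

MODEL = "meta-llama/Llama-3.2-1B-Instruct"  # assumed; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

prompt = (
    "Question: Is religion very important in your life?\n"
    "Options: (A) Yes (B) No\n"
    "Answer: ("
)
options = ["A", "B"]
survey = [0.84, 0.16]  # illustrative ground-truth shares, e.g. from a poll

with torch.no_grad():
    logits = model(**tokenizer(prompt, return_tensors="pt")).logits[0, -1]

# Score each option by the logit of its first token, then renormalize
# over the option set to obtain the model's opinion distribution.
option_ids = [tokenizer.encode(o, add_special_tokens=False)[0] for o in options]
model_dist = torch.softmax(logits[option_ids], dim=0).tolist()

# Jensen-Shannon distance as one possible alignment metric (an assumption;
# the paper may use a different divergence).
print("model distribution:", [round(p, 3) for p in model_dist])
print("JS distance to survey:", round(jensenshannon(model_dist, survey), 3))
```

In this framing, a larger distance on religion items than on broad social items would reproduce the study's headline finding at the level of a single question.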
📝 Abstract
Large Language Models (LLMs) are increasingly deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present a comprehensive, multilingual audit of the cultural alignment of contemporary LLMs, including GPT-4o-Mini, Gemini-2.5-Flash, Llama 3.2, Mistral, and Gemma 3, across India, East Asia, and Southeast Asia. Our study focuses on the sensitive domain of religion as a prism for broader alignment. To this end, we conduct a multi-faceted analysis of each LLM's internal representations, using log-probs/logits, to compare the model's opinion distributions against ground-truth public attitudes. We find that while these popular models generally align with public opinion on broad social issues, they consistently fail to represent religious viewpoints accurately, especially those of minority groups, and often amplify negative stereotypes. Lightweight interventions, such as demographic priming and native-language prompting, partially mitigate but do not eliminate these cultural gaps. We further show that downstream evaluations on bias benchmarks (such as CrowS-Pairs, IndiBias, ThaiCLI, and KoBBQ) reveal persistent harms and under-representation in sensitive contexts. Our findings underscore the urgent need for systematic, regionally grounded audits to ensure equitable global deployment of LLMs.
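As one concrete illustration of the lightweight interventions the abstract names, the sketch below builds demographically primed and native-language variants of a survey prompt. The persona wordings and the Hindi rendering are assumptions for illustration, not the paper's exact prompts; each variant would then be scored with the same log-prob procedure sketched above.

```python
# Sketch: demographic priming (a persona prefix) and native-language
# prompting as prompt-construction steps. Personas and the Hindi
# translation are illustrative assumptions.
from typing import Optional

def prime(question: str, persona: Optional[str] = None) -> str:
    """Prefix the question with a demographic persona, if one is given."""
    prefix = f"Answer as {persona} would.\n" if persona else ""
    return f"{prefix}{question}"

question_en = "Is religion very important in your life? Answer Yes or No."
# Native-language variant (Hindi, illustrative translation):
question_hi = "क्या धर्म आपके जीवन में बहुत महत्वपूर्ण है? हाँ या नहीं में उत्तर दें।"

for question in (question_en, question_hi):
    for persona in (None,
                    "a Hindu respondent from rural India",
                    "a Muslim respondent from urban India"):
        print(prime(question, persona))
        print("---")
```

Comparing the model's opinion distribution across these conditions shows whether priming and native-language prompting move it toward the surveyed population, which is the partial mitigation the abstract reports.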