Why are all LLMs Obsessed with Japanese Culture? On the Hidden Cultural and Regional Biases of LLMs

📅 2026-04-23
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study examines the implicit regional preferences of large language models (LLMs) when answering culture-related questions, and reports a pronounced inclination toward Japanese culture, which complicates the common assumption that LLM cultural bias is exclusively Western. The authors introduce the Culture-Related Open Questions (CROQ) dataset and evaluate models through multilingual prompting, cultural theme classification, output diversity analysis, and ablation studies across training stages. Their results suggest that the first clear signs of cultural bias appear after supervised fine-tuning rather than during pre-training. Prompting in English or other high-resource languages yields more diverse outputs and reduces the tendency to favor countries where the input language is an official language. According to the authors, no prior work has specifically examined LLM regional preferences on culture-related questions.

📝 Abstract
LLMs have shown limitations in cultural coverage and competence, and in some cases show regional biases such as amplifying Western and Anglocentric viewpoints. While there have been works analysing the cultural capabilities of LLMs, there has not been specific work highlighting LLM regional preferences on culture-related questions. In this work, we propose a new dataset based on a comprehensive taxonomy of Culture-Related Open Questions (CROQ). The results show that, contrary to previous cultural bias work, LLMs show a clear tendency towards countries such as Japan. Moreover, our results show that when prompted in English or other high-resource languages, LLMs tend to provide more diverse outputs and show less inclination towards answers highlighting countries for which the input language is an official language. Finally, we also investigate at which point of LLM training this cultural bias emerges, with our results suggesting that the first clear signs appear after supervised fine-tuning, and not during pre-training.
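
The abstract does not say how regional preference or output diversity is measured. As a rough illustration only, and not the authors' method, the sketch below tallies which countries a set of model responses mention and summarizes the spread with Shannon entropy, where lower entropy means the answers concentrate on fewer countries. The lexicon `COUNTRY_TERMS`, the function names, and the sample responses are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny lexicon; a real study would need full coverage.
COUNTRY_TERMS = {
    "Japan": ["japan", "japanese"],
    "France": ["france", "french"],
    "India": ["india", "indian"],
    "Mexico": ["mexico", "mexican"],
}

# One whole-word, case-insensitive pattern per country.
PATTERNS = {
    country: re.compile(r"\b(?:" + "|".join(terms) + r")\b", re.IGNORECASE)
    for country, terms in COUNTRY_TERMS.items()
}


def countries_mentioned(response):
    """Return the set of lexicon countries that a single response mentions."""
    return {c for c, pattern in PATTERNS.items() if pattern.search(response)}


def country_shares(responses):
    """Share of all country mentions going to each country.

    A country is counted at most once per response.
    """
    counts = Counter()
    for response in responses:
        counts.update(countries_mentioned(response))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: n / total for country, n in counts.items()}


def shannon_entropy(shares):
    """Entropy in bits of a share distribution; 0 means a single country dominates."""
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)


if __name__ == "__main__":
    # Made-up responses, for illustration only (not data from the paper).
    sample = [
        "You could try a Japanese tea ceremony.",
        "Visit Japan during cherry blossom season.",
        "A French cheese tasting is a good choice.",
        "Origami from Japan is a relaxing craft.",
    ]
    shares = country_shares(sample)
    print(shares)
    print(shannon_entropy(shares))
```

Running the same tally per prompt language, or per training checkpoint, would be one way to compare how concentrated the answers are under each condition, assuming comparable question sets.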

Problem

Research questions and friction points this paper is trying to address.

cultural bias
regional bias
large language models
Japan preference
CROQ

Innovation

Methods, ideas, or system contributions that make the work stand out.

cultural bias
large language models
regional preference
CROQ dataset
supervised fine-tuning