🤖 AI Summary
This paper addresses the problem of cultural value bias in large language models (LLMs), where default value assumptions often reflect dominant cultural groups, leading to cross-cultural misalignment and harm. To mitigate this, we propose CLCA, the first cultural learning-based alignment framework, designed to explicitly model social interactional intent and extract implicit cultural norms during fine-tuning. CLCA integrates role-play-driven cultural simulation with role-conditioned dialogue generation, enabling LLMs to acquire diverse implicit cultural norms through interactive learning. Unlike static prompting or conventional supervised fine-tuning, CLCA is architecture-agnostic and supports adaptation across multiple LLM families. We further construct a quantitative cultural value evaluation framework grounded in the World Values Survey. Experimental results demonstrate that CLCA significantly improves cross-cultural value alignment and response appropriateness. This work highlights cultural learning as a promising training paradigm for improving LLMs' value adaptability and cultural inclusivity.
📝 Abstract
Adapting large language models (LLMs) to diverse cultural values is a challenging task, as existing LLMs often reflect the values of specific groups by default, potentially causing harm to others. In this paper, we present CLCA, a novel framework for enhancing LLM alignment with cultural values based on cultural learning. The framework leverages simulated social interactions to generate conversations in which LLMs engage in role-playing within culturally adapted social scenarios, capturing implicit cultural norms for model fine-tuning. CLCA improves cultural value alignment across various model architectures, as measured using World Values Survey data, demonstrating the effectiveness of our proposed approach. Our results provide early evidence that understanding intent and social interactions can enhance cultural value adaptation in LLMs, highlighting the promise of training approaches based on cultural learning.
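
To make the described pipeline concrete, below is a minimal, illustrative sketch of the kind of cultural-learning data generation the abstract outlines: a model role-plays personas within culturally adapted social scenarios, and the resulting turns are converted into fine-tuning examples. All names here (`generate`, `SCENARIOS`, `build_sft_examples`), the prompt formats, and the example scenarios are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of a cultural-learning data pipeline in the spirit of CLCA.
# All identifiers, prompts, and scenarios are illustrative placeholders.

import json
from typing import Callable, Dict, List

# Hypothetical culturally adapted social scenarios (hand-written here;
# the paper derives such scenarios through cultural adaptation).
SCENARIOS: Dict[str, List[str]] = {
    "Germany": ["Declining a colleague's invitation to an after-work event."],
    "Japan": ["Giving critical feedback to a senior coworker."],
}


def simulate_dialogue(generate: Callable[[str], str],
                      culture: str, scenario: str, turns: int = 4) -> List[Dict[str, str]]:
    """Have the model role-play two personas from `culture` within `scenario`.

    `generate` is an assumed stand-in for any chat-completion call
    (local model or API client); it maps a prompt string to a reply string.
    """
    history: List[Dict[str, str]] = []
    for t in range(turns):
        speaker = "Persona A" if t % 2 == 0 else "Persona B"
        prompt = (
            f"You are {speaker}, a person living in {culture}.\n"
            f"Scenario: {scenario}\n"
            f"Conversation so far: {json.dumps(history, ensure_ascii=False)}\n"
            f"Reply with one short turn, acting according to local social norms."
        )
        history.append({"speaker": speaker, "text": generate(prompt)})
    return history


def build_sft_examples(culture: str, scenario: str,
                       dialogue: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Convert each dialogue turn into a (prompt, completion) pair for fine-tuning."""
    examples = []
    for i in range(1, len(dialogue)):
        context = "\n".join(f'{u["speaker"]}: {u["text"]}' for u in dialogue[:i])
        examples.append({
            "prompt": f"[Culture: {culture}] {scenario}\n{context}\n{dialogue[i]['speaker']}:",
            "completion": dialogue[i]["text"],
        })
    return examples


if __name__ == "__main__":
    # Toy generator so the sketch runs without any model or API key.
    toy_generate = lambda prompt: "A polite, norm-aware reply."
    for culture, scenarios in SCENARIOS.items():
        for scenario in scenarios:
            dialogue = simulate_dialogue(toy_generate, culture, scenario)
            for ex in build_sft_examples(culture, scenario, dialogue):
                print(json.dumps(ex, ensure_ascii=False))
```

In an actual run, `toy_generate` would be replaced by a real chat model call, and the printed JSON lines would feed a standard supervised fine-tuning loop; evaluation against World Values Survey responses happens separately, after fine-tuning.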