🤖 AI Summary
Large language models (LLMs) face significant challenges in cultural value alignment, including entrenched biases, homogeneous cultural representations, and degradation of factual knowledge. Prior approaches rely predominantly on the World Values Survey (WVS), which risks cultural flattening and fails to support fine-grained, task-specific cross-cultural understanding. This work is the first to systematically expose the representational limitations of WVS as a sole cultural source. We propose a “narrative-augmented” paradigm that integrates encyclopedic knowledge from Wikipedia with situated cultural narratives from NormAd, enabling a multi-source alignment training framework. Experiments demonstrate substantial improvements in cross-cultural question answering and value judgment tasks: cultural discriminability increases by 37% (p < 0.01), while factual consistency is preserved. Our approach achieves synergistic optimization of cultural distinctiveness and task adaptability.
📝 Abstract
Adapting cultural values in Large Language Models (LLMs) presents significant challenges, particularly due to biases and limited training data. Prior work primarily aligns LLMs with different cultural values using World Values Survey (WVS) data. However, it remains unclear whether this approach effectively captures cultural nuances or produces distinct cultural representations for various downstream tasks. In this paper, we systematically investigate WVS-based training for cultural value adaptation and find that relying solely on survey data can homogenize cultural norms and interfere with factual knowledge. To address these issues, we augment WVS with encyclopedic and scenario-based cultural narratives from Wikipedia and NormAd. While these narratives may have variable effects on downstream tasks, they consistently improve cultural distinctiveness compared to survey data alone. Our work highlights the inherent complexity of aligning cultural values with the goal of guiding task-specific behavior.