Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment

📅 2026-04-14

🤖 AI Summary

This study addresses the limited cultural adaptability of mainstream large language models, which are predominantly aligned with Western-centric values and struggle to emulate the values of distinct demographic subgroups. Taking Singapore as a case study, the authors build a dataset of over 20,000 samples from the World Values Survey to train models and evaluate how well they generalize to unseen, out-of-distribution subgroups. Combining persona prompting, fine-tuning on structured numerical preferences, and distance-aware fairness metrics, they find that even GPT-4.1 reaches only 57.4% accuracy in predicting subgroup modal preferences. Fine-tuning improves accuracy on unseen subgroups by an average of 17.4%, and the gains partially transfer to open-ended generation. However, models already emulate young, male, Chinese, and Christian personas better than others, and fine-tuning widens the disparity between subgroups under distance-aware metrics, which points to significant fairness challenges in fine-grained alignment.

📝 Abstract

Despite their global prevalence, many Large Language Models (LLMs) are aligned to a monolithic, often Western-centric set of values. This paper investigates the more challenging task of fine-grained value alignment: examining whether LLMs can emulate the distinct cultural values of demographic subgroups. Using Singapore as a case study and the World Values Survey (WVS), we examine the value landscape and show that even state-of-the-art models like GPT-4.1 achieve only 57.4% accuracy in predicting subgroup modal preferences. We construct a dataset of over 20,000 samples to train and evaluate a range of models. We demonstrate that simple fine-tuning on structured numerical preferences yields substantial gains, improving accuracy on unseen, out-of-distribution subgroups by an average of 17.4%. These gains partially transfer to open-ended generation. However, we find significant pre-existing performance biases, where models better emulate young, male, Chinese, and Christian personas. Furthermore, while fine-tuning improves average performance, it widens the disparity between subgroups when measured by distance-aware metrics. Our work offers insights into the limits and fairness implications of subgroup-level cultural alignment.
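The abstract names two kinds of measurement: accuracy in predicting a subgroup's modal preference, and distance-aware metrics used to compare subgroups. The sketch below illustrates one plausible reading of both on toy data. The abstract does not give the paper's exact definitions, so everything here is an assumption: the tie-breaking rule, the use of mean absolute distance on an ordinal scale, the max-minus-min disparity gap, and all names and data (`modal_preference`, `evaluate`, `disparity`, `group_a`, `group_b`) are hypothetical.

```python
from collections import Counter


def modal_preference(responses):
    """Most common answer in a subgroup; ties resolve to the smallest option (assumption)."""
    counts = Counter(responses)
    top = max(counts.values())
    return min(option for option, n in counts.items() if n == top)


def evaluate(survey, predictions):
    """Per-subgroup exact-match accuracy and mean absolute distance.

    survey:      {subgroup: {question: [integer responses on an ordinal scale]}}
    predictions: {subgroup: {question: predicted integer option}}
    """
    results = {}
    for subgroup, questions in survey.items():
        hits = 0
        total_distance = 0
        for question, responses in questions.items():
            target = modal_preference(responses)
            predicted = predictions[subgroup][question]
            hits += int(predicted == target)
            # Distance-aware: a near miss costs less than a far miss.
            total_distance += abs(predicted - target)
        n = len(questions)
        results[subgroup] = {
            "accuracy": hits / n,
            "mean_abs_distance": total_distance / n,
        }
    return results


def disparity(results, metric):
    """Gap between the best- and worst-served subgroup on one metric (assumed definition)."""
    values = [scores[metric] for scores in results.values()]
    return max(values) - min(values)


# Toy data, invented for illustration only; not taken from the World Values Survey.
survey = {
    "group_a": {"q1": [1, 1, 2, 3], "q2": [4, 4, 4, 2]},
    "group_b": {"q1": [3, 3, 2, 1], "q2": [1, 2, 2, 2]},
}
predictions = {
    "group_a": {"q1": 1, "q2": 4},
    "group_b": {"q1": 3, "q2": 4},
}

results = evaluate(survey, predictions)
accuracy_gap = disparity(results, "accuracy")
distance_gap = disparity(results, "mean_abs_distance")
```

Reporting both gaps side by side matters because the two can diverge: exact-match accuracy treats every miss alike, while a distance-aware score also reflects how far off a wrong prediction is.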

Problem

Research questions and friction points this paper is trying to address.

value alignment

subgroup values

cultural alignment

fairness

Large Language Models

Innovation

Methods, ideas, or system contributions that make the work stand out.

subgroup value alignment

persona prompting

fine-tuning on numerical preferences
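
As a rough illustration of how persona prompting and numerical preferences might fit together, the sketch below builds a prompt from demographic attributes and parses an integer answer from a model's reply. The prompt template, the attribute names, the example question wording, and the helpers `build_persona_prompt` and `parse_numeric_answer` are all hypothetical; the paper's actual prompts and output format are not given in this summary.

```python
import re


def build_persona_prompt(persona, question, low, high):
    """Compose a persona prompt that asks for one integer on a fixed scale (assumed template)."""
    description = ", ".join(f"{key}: {value}" for key, value in persona.items())
    return (
        f"You are a resident of Singapore with this profile: {description}.\n"
        f"Question: {question}\n"
        f"Answer with a single integer from {low} to {high}."
    )


def parse_numeric_answer(text, low, high):
    """Return the first integer in the reply if it lies on the scale, else None."""
    match = re.search(r"-?\d+", text)
    if match is None:
        return None
    value = int(match.group())
    return value if low <= value <= high else None


# Hypothetical persona and question wording, for illustration only.
persona = {
    "age group": "18-29",
    "gender": "male",
    "ethnicity": "Chinese",
    "religion": "Christian",
}
prompt = build_persona_prompt(
    persona, "How important is family in your life?", low=1, high=4
)
```

Constraining the reply to a single integer keeps the output directly comparable with survey response scales, and returning `None` for off-scale or missing numbers makes unparseable replies explicit rather than silently counting them as an answer.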