🤖 AI Summary
This study reveals implicit cultural biases in GPT-4's reasoning about cross-cultural social norms: a tendency to generate overgeneralized, low-specificity norms, and an ease with which minimal prompting reactivates stereotypical associations.
Method: Moving beyond conventional values-based surveys, the study introduces a "bottom-up", narrative-driven evaluation paradigm grounded in multicultural scenario cases, comprehension-oriented prompt engineering, and a mixed qualitative–quantitative bias-detection framework.
Contribution/Results: Although GPT-4 avoids overtly discriminatory language, it perpetuates cultural bias through concealment rather than elimination: its normative outputs systematically attenuate geographical and group-specific nuance, and these bias patterns are statistically robust and reproducible. The work contributes both a methodological innovation, a context-sensitive narrative-based assessment protocol, and an empirical benchmark for evaluating cultural fairness in large language models.
📝 Abstract
Large language models (LLMs) have been shown to align with the values of Western or North American cultures. Prior work demonstrated this effect mostly through surveys that directly ask (originally people, and now also LLMs) about their values. However, it does not follow that LLMs would consistently apply those stated values in real-world scenarios. To address this, we take a bottom-up approach, asking LLMs to reason about cultural norms in narratives from different cultures. We find that GPT-4 tends to generate norms that, while not necessarily incorrect, are significantly less culture-specific. In addition, while it avoids overtly generating stereotypes, the stereotypical representations of certain cultures are merely hidden rather than eliminated from the model, and they can easily be recovered. Addressing these challenges is a crucial step towards developing LLMs that fairly serve their diverse user base.
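As a concrete illustration of the bottom-up elicitation described above, the sketch below shows one way such a probe could be run: the model is given a culturally situated narrative and asked to articulate the social norm at play, so that the culture-specificity of its answers can then be compared across cultures. This is a minimal sketch assuming the OpenAI Python SDK; the prompt wording, the helper name `elicit_norm`, and the example narrative are hypothetical placeholders, not the paper's actual evaluation materials.

```python
# Minimal sketch of a narrative-based norm probe (illustrative only).
# Assumes the OpenAI Python SDK (openai >= 1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def elicit_norm(narrative: str, model: str = "gpt-4") -> str:
    """Ask the model to state the social norm underlying a short narrative."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": "You reason about social norms in short stories."},
            {"role": "user",
             "content": (
                 "Read the following story and state, in one sentence, the social "
                 "norm that explains the characters' behavior. Be as specific to "
                 "the story's cultural context as the story allows.\n\n" + narrative
             )},
        ],
        temperature=0,  # deterministic output makes repeated probes comparable
    )
    return response.choices[0].message.content

# Hypothetical usage: run the same probe on narratives from many cultures and
# compare how culture-specific (vs. overgeneralized) the returned norms are.
story = ("At a family dinner in Seoul, the youngest person at the table waited "
         "for her grandmother to lift her spoon before starting to eat.")
print(elicit_norm(story))
```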