🤖 AI Summary
This study addresses the tendency of large language models (LLMs) to reproduce stereotypes and homogenize cultures in cross-cultural content generation, often failing to authentically capture cultural specificity. Using recipe generation as a testbed, the work presents the first systematic evaluation of multiple LLMs’ cross-cultural adaptability across varying cultural distances. Leveraging the GlobalFusion dataset, the assessment combines cultural distance metrics, internal representation analysis, and ingredient provenance tracing. Results reveal no significant correlation between the divergence of model-generated recipes and cultural distance, minimal retention of cultural information in internal representations, and poor alignment between countries and their emblematic ingredients. These findings expose fundamental limitations in LLMs’ capacity for cultural representation, novelty judgment, and coherent integration of culturally specific elements.
📝 Abstract
Large Language Models (LLMs) are increasingly used to generate and shape cultural content, ranging from narrative writing to artistic production. While these models demonstrate impressive fluency and generative capacity, prior work has shown that they also exhibit systematic cultural biases, raising concerns about stereotyping, homogenization, and the erasure of culturally specific forms of expression. Understanding whether LLMs can meaningfully align with diverse cultures beyond the dominant ones remains a critical challenge. In this paper, we study cultural adaptation in LLMs through the lens of cooking recipes, a domain in which culture, tradition, and creativity are tightly intertwined. We build on the *GlobalFusion* dataset, which pairs human recipes from different countries according to established measures of cultural distance. Using the same country pairs, we generate culturally adapted recipes with multiple LLMs, enabling a direct comparison between human and LLM behavior in cross-cultural content creation. Our analysis shows that LLMs fail to produce culturally representative adaptations: unlike humans, the divergence of their generated recipes does not correlate with cultural distance. We further provide explanations for this gap. We show that cultural information is weakly preserved in internal model representations, that models inflate the novelty of their output by misunderstanding notions such as creativity and tradition, and that they fail to associate adaptations with their corresponding countries or to ground them in culturally salient elements such as ingredients. These findings highlight fundamental limitations of current LLMs for culturally oriented generation and have important implications for their use in culturally sensitive applications.