🤖 AI Summary
This study systematically investigates cultural misrepresentation in large language models' (LLMs) generation of stories about diverse Indian cultural identities. The authors propose TALES-Tax, the first culturally grounded taxonomy of misrepresentation tailored to the Indian context, built from focus groups and individual surveys with participants who have lived experience in India. Using this taxonomy, they evaluate six models through a large-scale annotation study (2,925 annotations from 108 annotators spanning 71 Indian regions and 14 languages), and then distill the annotations into TALES-QA, a standalone question bank for assessing the cultural knowledge of foundation models. Results reveal that 88% of generated narratives contain cultural inaccuracies, with errors most pronounced in mid- and low-resourced languages and in stories set in peri-urban regions of India; although LLMs often possess the relevant cultural knowledge, they fail to reliably retrieve and apply it during generation. This work provides the first quantitative evidence of systematic cultural bias in LLM-generated Indian narratives and establishes a reproducible methodology, alongside the taxonomy, annotated dataset, and question bank, for culturally sensitive evaluation of generative AI.
📝 Abstract
Millions of users across the globe turn to AI chatbots for their creative needs, inviting widespread interest in understanding how such chatbots represent diverse cultures. At the same time, evaluating cultural representations in open-ended tasks remains challenging and underexplored. In this work, we present TALES, an evaluation of cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. First, we develop TALES-Tax, a taxonomy of cultural misrepresentations by collating insights from participants with lived experiences in India through focus groups (N=9) and individual surveys (N=15). Using TALES-Tax, we evaluate 6 models through a large-scale annotation study spanning 2,925 annotations from 108 annotators with lived cultural experience from across 71 regions in India and 14 languages. Concerningly, we find that 88% of the generated stories contain one or more cultural inaccuracies, and such errors are more prevalent in mid- and low-resourced languages and stories based in peri-urban regions in India. Lastly, we transform the annotations into TALES-QA, a standalone question bank to evaluate the cultural knowledge of foundational models. Through this evaluation, we surprisingly discover that models often possess the requisite cultural knowledge despite generating stories rife with cultural misrepresentations.