Large language models perceive cities through a culturally uneven baseline

📅 2026-04-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
It remains unclear whether large language models (LLMs) perceive cities from a culturally neutral standpoint. This study evaluates LLM-generated urban descriptions on a globally balanced sample of street-view images, using a neutral prompt alongside prompts that invoke region-specific cultural standpoints, and combining open-ended text generation with structured scoring on six dimensions (safety, beauty, wealth, liveliness, boredom and depression). The outputs are compared semantically and affectively against regional human benchmarks. Even under neutral prompting the models are not neutral: prompts associated with Europe and Northern America stay systematically closer to the neutral baseline than many non-Western prompts. Culturally proximate prompts improve alignment with human descriptions from the corresponding region, but they do not recover human levels of semantic diversity, and the structured judgments only partly reproduce human group differences. The work documents a culturally uneven baseline in LLM urban perception and offers empirical grounding for culturally aware prompt engineering.

📝 Abstract
Large language models (LLMs) are increasingly used to describe, evaluate and interpret places, yet it remains unclear whether they do so from a culturally neutral standpoint. Here we test urban perception in frontier LLMs using a balanced global street-view sample and prompts that either remain neutral or invoke different regional cultural standpoints. Across open-ended descriptions and structured place judgments, the neutral condition proved not to be neutral in practice. Prompts associated with Europe and Northern America remained systematically closer to the baseline than many non-Western prompts, indicating that model perception is organized around a culturally uneven reference frame rather than a universal one. Cultural prompting also shifted affective evaluation, producing sentiment-based ingroup preference for some prompted identities. Comparisons with regional human text-image benchmarks showed that culturally proximate prompting could improve alignment with human descriptions, but it did not recover human levels of semantic diversity and often preserved an affectively elevated style. The same asymmetry reappeared in structured judgments of safety, beauty, wealth, liveliness, boredom and depression, where model outputs were interpretable but only partly reproduced human group differences. These findings suggest that LLMs do not simply perceive cities from nowhere: they do so through a culturally uneven baseline that shapes what appears ordinary, familiar and positively valued.
Problem

Research questions and friction points this paper is trying to address.

large language models
urban perception
cultural bias
street-view analysis
affective evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

cultural bias
urban perception
large language models
prompt engineering
cross-cultural alignment