Probing Cultural Awareness in LLMs: A Case Study of Cross-Culture Aesthetic Stylistics

📅 2026-05-26
🤖 AI Summary
This study examines an underexplored capacity of large language models: understanding and generating aesthetic stylistic expression across cultures, in particular the use of culturally resonant linguistic strategies. Focusing on stylistic differences between Hong Kong and the Chinese Mainland in translated movie titles and advertising slogans, the work introduces C4STYLI, a benchmark dataset, and evaluates models on both style recognition and style generation. Structural ablation combined with logistic regression probing indicates that, in the Hong Kong setting, models rely primarily on surface-level linguistic features rather than stylistic structure. Recognition and generation performance are not consistently aligned, model recognition differs from human judgments, and recognition ability varies across text domains, which points to limits in how well current models capture cross-cultural stylistic nuance.
📝 Abstract
Large Language Models (LLMs) are increasingly deployed in diverse cultural contexts, yet their ability to master aesthetic stylistics, i.e., the strategic use of language to evoke cultural resonance, remains underexplored. We curate C4STYLI, a benchmark of highly stylized translated movie titles and advertising slogans from Hong Kong and the Chinese Mainland, to evaluate LLMs via the lens of behavioral recognition and productive competence. Extensive evaluations show that LLMs differ from humans in stylistic recognition, and this recognition ability varies across text domains. In addition, stylistic recognition and generation performance in LLMs are not consistently aligned. To further examine whether LLMs genuinely capture stylistic information in stylistic recognition, we conduct structural ablation with logistic regression probes. We find that, in the Hong Kong setting, stylistic recognition in LLMs relies primarily on surface-level linguistic information rather than stylistic structure. This suggests limited sensitivity to Hong Kong-specific stylistic structure.
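The abstract's probing setup (fit a logistic regression probe, ablate part of the input, compare accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation, and the paper's exact ablation procedure is not described here. The two-dimensional features, the names `make_toy_data`, `ablate`, `train_probe`, and `run_ablation_demo`, and the choice to zero out a feature as the "ablation" are all assumptions made for illustration. The data is synthetic and deliberately constructed so that only the "surface" dimension carries label signal; it shows the mechanics of the comparison and says nothing about C4STYLI or any real model.

```python
import math
import random


def make_toy_data(n=400, seed=0):
    """Synthetic stand-in for model representations (hypothetical, not C4STYLI).

    Feature 0 ("surface") is built to correlate with the label;
    feature 1 ("structure") is pure noise by construction.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # e.g. 0 = one regional style, 1 = the other
        sign = 1.0 if label else -1.0
        X.append([sign + rng.gauss(0, 0.5), rng.gauss(0, 0.5)])
        y.append(label)
    return X, y


def ablate(X, dim):
    """Zero out one feature dimension (an assumed stand-in for ablation)."""
    return [[0.0 if j == dim else v for j, v in enumerate(row)] for row in X]


def train_probe(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression probe with batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of examples where the probe's sign matches the label."""
    hits = 0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hits += int((z > 0) == (yi == 1))
    return hits / len(X)


def run_ablation_demo():
    """Train and test a probe on full, surface-ablated, and structure-ablated features."""
    X, y = make_toy_data()
    half = len(X) // 2  # first half for training, second half held out
    results = {}
    for name, dim in [("full", None), ("surface_ablated", 0), ("structure_ablated", 1)]:
        Xa = X if dim is None else ablate(X, dim)
        w, b = train_probe(Xa[:half], y[:half])
        results[name] = accuracy(w, b, Xa[half:], y[half:])
    return results


if __name__ == "__main__":
    print(run_ablation_demo())
```

The reading of such a comparison is as follows: if held-out probe accuracy stays high when one kind of information is removed but falls toward chance when another is removed, the probe (and, by extension, the representation it reads) depends on the latter. In a real study the features would come from model hidden states and the ablation would be applied to the input text rather than to a feature dimension.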
Problem

Research questions and friction points this paper is trying to address.

cultural awareness
aesthetic stylistics
large language models
cross-culture
stylistic recognition
Innovation

Methods, ideas, or system contributions that make the work stand out.

cultural awareness
aesthetic stylistics
cross-cultural NLP
structural probing
LLM benchmarking