LLMs as Cultural Archives: Cultural Commonsense Knowledge Graph Extraction

📅 2026-01-25
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a prompt-based iterative framework that treats large language models (LLMs) as multilingual cultural archives to systematically extract culture-specific entities, relations, and practices. By combining prompt engineering, iterative extraction, and multi-hop reasoning chains, the approach constructs a structured cultural commonsense knowledge graph and enables cross-lingual alignment of cultural knowledge. Experiments across five national cultures reveal both the inherent representational imbalances in LLMs and the utility of the resulting graph: it significantly enhances the cultural reasoning and story generation capabilities of smaller models, with English-based reasoning chains yielding the strongest performance. This study demonstrates, for the first time, how implicit cultural knowledge embedded in LLMs can be externalized, structured, and effectively repurposed for downstream culture-aware applications.
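The summary describes the extraction pipeline only at a high level. As a concrete illustration, the Python sketch below shows what such an iterative, prompt-based triple-extraction loop could look like; the `query_llm` helper, the prompt wording, and the `head | relation | tail` output format are all assumptions made for illustration, not the paper's actual prompts or code.

```python
# Minimal sketch of an iterative, prompt-based CCKG extraction loop.
# `query_llm` is a hypothetical stand-in for any chat-completion API, and
# the prompt template and 'head | relation | tail' format are assumptions.
from collections import deque

def query_llm(prompt: str) -> str:
    """Placeholder: call an LLM of your choice and return its text output."""
    raise NotImplementedError

def parse_triples(text):
    """Parse lines like 'head | relation | tail' into (h, r, t) triples."""
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples

def extract_cckg(culture, seeds, max_rounds=3):
    """Iteratively elicit culture-specific triples, expanding from seed concepts."""
    graph = set()
    seen = set(seeds)
    frontier = deque(seeds)
    for _ in range(max_rounds):
        next_frontier = deque()
        while frontier:
            concept = frontier.popleft()
            prompt = (
                f"List commonsense facts about '{concept}' that are specific "
                f"to {culture} culture, one per line, formatted as "
                "'head | relation | tail'."
            )
            for h, r, t in parse_triples(query_llm(prompt)):
                if (h, r, t) not in graph:
                    graph.add((h, r, t))
                    for node in (h, t):        # newly seen entities become
                        if node not in seen:   # the next round's frontier
                            seen.add(node)
                            next_frontier.append(node)
        frontier = next_frontier
    return graph
```

The breadth-first expansion here is one plausible reading of "iterative extraction": each round's newly discovered entities seed the next round's prompts, so the graph grows outward from a handful of seed concepts.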

๐Ÿ“ Abstract
Large language models (LLMs) encode rich cultural knowledge learned from diverse web-scale data, offering an unprecedented opportunity to model cultural commonsense at scale. Yet this knowledge remains mostly implicit and unstructured, limiting its interpretability and use. We present an iterative, prompt-based framework for constructing a Cultural Commonsense Knowledge Graph (CCKG) that treats LLMs as cultural archives, systematically eliciting culture-specific entities, relations, and practices and composing them into multi-step inferential chains across languages. We evaluate CCKG on five countries with human judgments of cultural relevance, correctness, and path coherence. We find that the cultural knowledge graphs are better realized in English, even when the target culture is not primarily English-speaking (e.g., Chinese, Indonesian, Arabic), indicating uneven cultural encoding in current LLMs. Augmenting smaller LLMs with CCKG improves performance on cultural reasoning and story generation, with the largest gains coming from English chains. Our results show both the promise and the limits of LLMs as cultural technologies, and suggest that chain-structured cultural knowledge is a practical substrate for culturally grounded NLP.
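The abstract mentions composing extracted knowledge into multi-step inferential chains and using them to augment smaller models. The sketch below shows one plausible way to enumerate such chains from (head, relation, tail) triples and serialize them into a prompt; the traversal strategy, function names, and prompt format are illustrative assumptions rather than the authors' method.

```python
# Minimal sketch: compose multi-hop inferential chains from CCKG triples and
# serialize them into a prompt for a smaller model. Traversal strategy,
# function names, and prompt wording are illustrative assumptions.
from collections import defaultdict

def build_index(triples):
    """Index triples by head entity for forward traversal."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head].append((rel, tail))
    return index

def compose_chains(triples, start, depth=3):
    """Enumerate chains of up to `depth` hops starting from `start`."""
    index = build_index(triples)
    chains = []

    def walk(node, path):
        if len(path) == depth:
            chains.append(list(path))
            return
        extended = False
        for rel, tail in index.get(node, []):
            if all(tail != step[2] for step in path):  # avoid revisiting tails
                extended = True
                path.append((node, rel, tail))
                walk(tail, path)
                path.pop()
        if path and not extended:           # dead end: keep the partial chain
            chains.append(list(path))

    walk(start, [])
    return chains

def chains_to_prompt(chains, question):
    """Serialize chains as context lines for a smaller model (assumed format)."""
    lines = [" -> ".join(f"{h} {r} {t}" for h, r, t in chain) for chain in chains]
    return "Cultural knowledge:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"

# Toy usage with a three-triple graph about Indonesian culture.
toy = [
    ("Lebaran", "involves", "mudik"),
    ("mudik", "means", "returning to one's hometown"),
    ("returning to one's hometown", "strengthens", "family ties"),
]
print(chains_to_prompt(compose_chains(toy, "Lebaran"),
                       "Why do cities empty out before Lebaran?"))
```

On the toy graph this prints a single three-hop chain followed by the question, which is the kind of chain-structured context the paper reports helps smaller models on cultural reasoning and story generation.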
Problem

Research questions and friction points this paper is trying to address.

cultural commonsense
knowledge graph extraction
large language models
cultural representation
structured knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cultural Commonsense Knowledge Graph
LLMs as Cultural Archives
Prompt-based Knowledge Extraction
Cross-lingual Cultural Reasoning
Chain-structured Knowledge