LLMs as Cultural Archives: Cultural Commonsense Knowledge Graph Extraction

📅 2026-01-25
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work proposes a prompt-based iterative framework that treats large language models (LLMs) as multilingual cultural archives to systematically extract culture-specific entities, relations, and practices. By combining prompt engineering, iterative extraction, and multi-hop reasoning chains, the approach constructs a structured cultural commonsense knowledge graph and enables cross-lingual alignment of cultural knowledge. Experiments across five national cultures reveal both the inherent representational imbalances in LLMs and the utility of the resulting graph: it significantly enhances the cultural reasoning and story generation capabilities of smaller models, with English-based reasoning chains yielding the strongest performance. This study demonstrates, for the first time, how implicit cultural knowledge embedded in LLMs can be externalized, structured, and effectively repurposed for downstream culture-aware applications.
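The summary describes the extraction pipeline only at a high level. As a concrete illustration, the Python sketch below shows what such an iterative, prompt-based triple-extraction loop could look like; the `query_llm` helper, the prompt wording, and the `head | relation | tail` output format are all assumptions made for illustration, not the paper's actual prompts or code.

```python
# Minimal sketch of an iterative, prompt-based CCKG extraction loop.
# `query_llm` is a hypothetical stand-in for any chat-completion API, and
# the prompt template and 'head | relation | tail' format are assumptions.
from collections import deque

def query_llm(prompt: str) -> str:
    """Placeholder: call an LLM of your choice and return its text output."""
    raise NotImplementedError

def parse_triples(text):
    """Parse lines like 'head | relation | tail' into (h, r, t) triples."""
    triples = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples

def extract_cckg(culture, seeds, max_rounds=3):
    """Iteratively elicit culture-specific triples, expanding from seed concepts."""
    graph = set()
    seen = set(seeds)
    frontier = deque(seeds)
    for _ in range(max_rounds):
        next_frontier = deque()
        while frontier:
            concept = frontier.popleft()
            prompt = (
                f"List commonsense facts about '{concept}' that are specific "
                f"to {culture} culture, one per line, formatted as "
                "'head | relation | tail'."
            )
            for h, r, t in parse_triples(query_llm(prompt)):
                if (h, r, t) not in graph:
                    graph.add((h, r, t))
                    for node in (h, t):        # newly seen entities become
                        if node not in seen:   # the next round's frontier
                            seen.add(node)
                            next_frontier.append(node)
        frontier = next_frontier
    return graph
```

The breadth-first expansion here is one plausible reading of "iterative extraction": each round's newly discovered entities seed the next round's prompts, so the graph grows outward from a handful of seed concepts.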

๐Ÿ“ Abstract
Large language models (LLMs) encode rich cultural knowledge learned from diverse web-scale data, offering an unprecedented opportunity to model cultural commonsense at scale. Yet this knowledge remains mostly implicit and unstructured, limiting its interpretability and use. We present an iterative, prompt-based framework for constructing a Cultural Commonsense Knowledge Graph (CCKG) that treats LLMs as cultural archives, systematically eliciting culture-specific entities, relations, and practices and composing them into multi-step inferential chains across languages. We evaluate CCKG on five countries with human judgments of cultural relevance, correctness, and path coherence. We find that the cultural knowledge graphs are better realized in English, even when the target culture is not primarily English-speaking (e.g., Chinese, Indonesian, Arabic), indicating uneven cultural encoding in current LLMs. Augmenting smaller LLMs with CCKG improves performance on cultural reasoning and story generation, with the largest gains coming from English chains. Our results show both the promise and the limits of LLMs as cultural technologies, and suggest that chain-structured cultural knowledge is a practical substrate for culturally grounded NLP.
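The abstract mentions composing extracted knowledge into multi-step inferential chains and using them to augment smaller models. The sketch below shows one plausible way to enumerate such chains from (head, relation, tail) triples and serialize them into a prompt; the traversal strategy, function names, and prompt format are illustrative assumptions rather than the authors' method.

```python
# Minimal sketch: compose multi-hop inferential chains from CCKG triples and
# serialize them into a prompt for a smaller model. Traversal strategy,
# function names, and prompt wording are illustrative assumptions.
from collections import defaultdict

def build_index(triples):
    """Index triples by head entity for forward traversal."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head].append((rel, tail))
    return index

def compose_chains(triples, start, depth=3):
    """Enumerate chains of up to `depth` hops starting from `start`."""
    index = build_index(triples)
    chains = []

    def walk(node, path):
        if len(path) == depth:
            chains.append(list(path))
            return
        extended = False
        for rel, tail in index.get(node, []):
            if all(tail != step[2] for step in path):  # avoid revisiting tails
                extended = True
                path.append((node, rel, tail))
                walk(tail, path)
                path.pop()
        if path and not extended:           # dead end: keep the partial chain
            chains.append(list(path))

    walk(start, [])
    return chains

def chains_to_prompt(chains, question):
    """Serialize chains as context lines for a smaller model (assumed format)."""
    lines = [" -> ".join(f"{h} {r} {t}" for h, r, t in chain) for chain in chains]
    return "Cultural knowledge:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"

# Toy usage with a three-triple graph about Indonesian culture.
toy = [
    ("Lebaran", "involves", "mudik"),
    ("mudik", "means", "returning to one's hometown"),
    ("returning to one's hometown", "strengthens", "family ties"),
]
print(chains_to_prompt(compose_chains(toy, "Lebaran"),
                       "Why do cities empty out before Lebaran?"))
```

On the toy graph this prints a single three-hop chain followed by the question, which is the kind of chain-structured context the paper reports helps smaller models on cultural reasoning and story generation.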
Problem

Research questions and friction points this paper is trying to address.

cultural commonsense
knowledge graph extraction
large language models
cultural representation
structured knowledge
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cultural Commonsense Knowledge Graph
LLMs as Cultural Archives
Prompt-based Knowledge Extraction
Cross-lingual Cultural Reasoning
Chain-structured Knowledge