Isolating Culture Neurons in Multilingual Large Language Models

📅 2025-08-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates how multilingual large language models (LLMs) encode culture-specific knowledge, focusing on how culture-specific neurons can be localized and how they overlap and interact with language-specific neurons. Method: The authors extend an established methodology for identifying language-specific neurons so that it also localizes culture-specific neurons, and introduce MUREL, a curated dataset of 85.2 million tokens spanning six cultures, to support their localization and intervention experiments. Contribution/Results: The experiments show that cultural representations are encoded predominantly in upper layers; that different cultures occupy distinct neuron populations; and that culture neurons can be modulated independently of language-specific neurons and of neurons specific to other cultures. These findings suggest that cultural knowledge and propensities in LLMs can be selectively isolated and edited, which could support fairness, inclusivity, and alignment.

📝 Abstract

Language and culture are deeply intertwined, yet it remains unclear how and where multilingual large language models encode culture. Here, we build on an established methodology for identifying language-specific neurons and extend it to localize and isolate culture-specific neurons, carefully disentangling their overlap and interaction with language-specific neurons. To facilitate our experiments, we introduce MUREL, a curated dataset of 85.2 million tokens spanning six different cultures. Our localization and intervention experiments show that LLMs encode different cultures in distinct neuron populations, predominantly in upper layers, and that these culture neurons can be modulated independently from language-specific neurons or those specific to other cultures. These findings suggest that cultural knowledge and propensities in multilingual language models can be selectively isolated and edited, promoting fairness, inclusivity, and alignment. Code and data are available at https://github.com/namazifard/Culture_Neurons .
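To make the localization and disentangling steps more concrete, the following is a minimal sketch of one way group-specific neurons could be selected and then separated from language-specific ones. It is not the authors' code: the abstract does not say which selectivity score is used, so the entropy-based criterion, the function names, the thresholds (`max_entropy`, `min_prob`), and the input format are all assumptions made for illustration.

```python
import math

def activation_entropy(probs):
    """Entropy of a neuron's activation probabilities, normalized across groups."""
    total = sum(probs)
    if total == 0:
        return math.inf  # a neuron that never fires is treated as non-specific
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:
            entropy -= q * math.log(q)
    return entropy

def specific_neurons(act_probs, max_entropy, min_prob):
    """Map each group to the set of neurons that fire selectively for it.

    act_probs: {neuron_id: {group: activation probability}} (hypothetical format).
    A neuron is kept if its entropy is low and it is active often enough
    for its top group.
    """
    selected = {}
    for neuron, by_group in act_probs.items():
        if activation_entropy(list(by_group.values())) > max_entropy:
            continue
        top_group = max(by_group, key=by_group.get)
        if by_group[top_group] >= min_prob:
            selected.setdefault(top_group, set()).add(neuron)
    return selected

def isolate_culture_neurons(culture_sets, language_sets):
    """Drop every neuron that is also language-specific (assumed disentangling step)."""
    language_all = set().union(*language_sets.values())
    return {culture: ns - language_all for culture, ns in culture_sets.items()}

# Toy activation probabilities; the values are made up for illustration.
culture_probs = {
    "n0": {"A": 0.90, "B": 0.05},
    "n1": {"A": 0.50, "B": 0.50},
    "n2": {"A": 0.02, "B": 0.80},
    "n3": {"A": 0.85, "B": 0.10},
}
language_probs = {
    "n0": {"x": 0.90, "y": 0.03},
    "n1": {"x": 0.40, "y": 0.40},
    "n2": {"x": 0.50, "y": 0.50},
    "n3": {"x": 0.30, "y": 0.30},
}

culture_sets = specific_neurons(culture_probs, max_entropy=0.4, min_prob=0.5)
language_sets = specific_neurons(language_probs, max_entropy=0.4, min_prob=0.5)
isolated = isolate_culture_neurons(culture_sets, language_sets)
print(isolated)
```

In this toy setup, neuron `n0` is selective for culture `A` but also for language `x`, so the set difference removes it from the culture-only population. A real pipeline would estimate the activation probabilities from model activations on culture- and language-labeled text.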

Problem

Research questions and friction points this paper is trying to address.

Identify how multilingual LLMs encode cultural information

Disentangle culture-specific neurons from language-specific ones

Enable selective editing of cultural knowledge in LLMs

Innovation

Methods, ideas, or system contributions that make the work stand out.

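Extend an established language-neuron identification methodology to localize and isolate culture-specific neurons

Introduce MUREL, a curated dataset of 85.2 million tokens spanning six cultures

Show that culture neurons, found predominantly in upper layers, can be modulated independently of language-specific neurons and of other cultures' neurons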
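The idea of modulating one neuron population without disturbing another can be pictured with a toy feed-forward block. This is only an illustrative sketch, not the paper's intervention code: the single ReLU layer, the `ablate` parameter, and the weights are all hypothetical, and a real experiment would intervene on the activations of an actual transformer.

```python
def mlp_forward(x, w_in, w_out, ablate=frozenset()):
    """One ReLU feed-forward block; hidden units listed in `ablate` are forced to zero.

    w_in:  one weight row per hidden unit (each row has len(x) entries).
    w_out: one weight row per output unit (each row has one entry per hidden unit).
    """
    hidden = []
    for j, weights in enumerate(w_in):
        pre_activation = sum(w * xi for w, xi in zip(weights, x))
        activation = max(0.0, pre_activation)
        hidden.append(0.0 if j in ablate else activation)
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

# Toy weights: hidden unit 0 feeds only output 0; units 1 and 2 feed only output 1.
x = [1.0, 2.0]
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_out = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]

baseline = mlp_forward(x, w_in, w_out)
without_unit_2 = mlp_forward(x, w_in, w_out, ablate={2})
without_unit_0 = mlp_forward(x, w_in, w_out, ablate={0})
print(baseline, without_unit_2, without_unit_0)
```

Zeroing unit 2 changes only the second output, while zeroing unit 0 changes only the first. That is the kind of selective effect an intervention on a well-isolated neuron population is meant to produce.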

🔎 Similar Papers

Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs