The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units

📅 2024-11-04

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This study investigates whether large language models (LLMs) contain brain-like, functionally specialized units for language processing and whether those units causally support linguistic behavior. Across 18 popular LLMs, we use the functional localization approach from neuroscience to identify language-selective units, then test their causal role with targeted ablation: removing these units, but not random units, drastically impairs performance on language tasks. Language-selective units are also more aligned with brain recordings from the human language system than random units are. Extending the same localization method to other cognitive domains reveals specialized networks for reasoning and social capabilities in some models, with substantial differences among models. Together, the results provide functional and causal evidence for specialization in LLMs and highlight parallels with the functional organization of the brain.

📝 Abstract

Large language models (LLMs) exhibit remarkable capabilities on not just language tasks, but also various tasks that are not linguistic in nature, such as logical reasoning and social inference. In the human brain, neuroscience has identified a core language system that selectively and causally supports language processing. We here ask whether similar specialization for language emerges in LLMs. We identify language-selective units within 18 popular LLMs, using the same localization approach that is used in neuroscience. We then establish the causal role of these units by demonstrating that ablating LLM language-selective units -- but not random units -- leads to drastic deficits in language tasks. Correspondingly, language-selective LLM units are more aligned to brain recordings from the human language system than random units. Finally, we investigate whether our localization method extends to other cognitive domains: while we find specialized networks in some LLMs for reasoning and social capabilities, there are substantial differences among models. These findings provide functional and causal evidence for specialization in large language models, and highlight parallels with the functional organization in the brain.
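
The localization step described in the abstract can be illustrated with a minimal sketch. It assumes a contrast between activations for sentences and activations for control stimuli, scored per unit with a Welch t-statistic, and a top 1% selection cutoff; the abstract states none of these details, so the contrast, the statistic, the cutoff, the function name `localize_units`, and the synthetic data are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def localize_units(act_sent, act_ctrl, top_frac=0.01):
    """Return indices of the units most selective for sentences over controls.

    act_sent, act_ctrl: arrays of shape (n_stimuli, n_units) holding one
    activation value per stimulus and unit (hypothetical input format).
    """
    # Welch t-statistic per unit: difference of means over pooled standard error.
    mean_diff = act_sent.mean(axis=0) - act_ctrl.mean(axis=0)
    var_s = act_sent.var(axis=0, ddof=1) / act_sent.shape[0]
    var_c = act_ctrl.var(axis=0, ddof=1) / act_ctrl.shape[0]
    t = mean_diff / np.sqrt(var_s + var_c + 1e-12)
    # Keep the top fraction of units by t-value (assumed cutoff).
    k = max(1, int(round(top_frac * t.shape[0])))
    return np.argsort(t)[::-1][:k]

# Synthetic activations standing in for real model activations.
rng = np.random.default_rng(0)
n_stim, n_units = 200, 1000
act_ctrl = rng.normal(size=(n_stim, n_units))
act_sent = rng.normal(size=(n_stim, n_units))
planted = np.arange(10)          # units we make "language-selective" by hand
act_sent[:, planted] += 3.0      # stronger response to sentences

selected = localize_units(act_sent, act_ctrl, top_frac=0.01)
print(sorted(selected.tolist()))
```

On real models the two activation matrices would come from forward passes over the two stimulus sets; here they are random draws with ten planted selective units so that the selection step can be checked in isolation.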

Problem

Research questions and friction points this paper is trying to address.

Identify language-selective units in LLMs

Establish causal role of these units

Compare LLM specialization with brain organization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Neuro-inspired localization in LLMs

Causal ablation of language units

Cross-domain specialization analysis
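
The ablation comparison listed above (selective units versus random units) can be sketched on a toy model. This is a deliberately simplified illustration: the "model" is a linear readout over synthetic hidden activations, ablation is implemented as zeroing, and the random control set is drawn from the non-selective units. All of these choices, and the names `ablate` and `accuracy`, are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def ablate(hidden, unit_idx):
    """Return a copy of the hidden activations with the given units set to zero."""
    out = hidden.copy()
    out[:, unit_idx] = 0.0
    return out

def accuracy(hidden, readout, labels):
    """Binary accuracy of a linear readout applied to the hidden activations."""
    preds = (hidden @ readout > 0).astype(int)
    return float((preds == labels).mean())

rng = np.random.default_rng(0)
n_samples, n_units, k = 500, 1000, 10

# Toy setup: the task depends only on the first k units.
hidden = rng.normal(size=(n_samples, n_units))
readout = np.zeros(n_units)
selective = np.arange(k)
readout[selective] = 1.0
labels = (hidden @ readout > 0).astype(int)

# Control: the same number of units, drawn at random from the remaining ones.
others = np.setdiff1d(np.arange(n_units), selective)
random_units = rng.choice(others, size=k, replace=False)

acc_intact = accuracy(hidden, readout, labels)
acc_selective = accuracy(ablate(hidden, selective), readout, labels)
acc_random = accuracy(ablate(hidden, random_units), readout, labels)
print(acc_intact, acc_selective, acc_random)
```

In this toy setting, zeroing the task-relevant units drops accuracy to roughly chance while zeroing the random control set leaves it unchanged, which mirrors the qualitative pattern the abstract reports. A real experiment would ablate units inside an LLM and measure performance on language benchmarks instead.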
