On the Analogy between Human Brain and LLMs: Spotting Key Neurons in Grammar Perception

📅 2025-11-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether large language models (LLMs) exhibit syntax-category–specific neural representations analogous to those observed in the human brain. Method: Building on Llama 3, we combine neuron importance analysis, part-of-speech (POS)-conditioned activation profiling, and linear POS classifiers to systematically identify neuron subsets highly sensitive to grammatical categories (e.g., nouns, verbs). Contribution/Results: We discover, for the first time in LLMs, an interpretable, structured syntactic subspace: distinct neuron groups show stable, linearly separable activation patterns across POS classes. A linear classifier trained on these activations achieves 92.4% accuracy on held-out data, significantly surpassing baseline models. These findings provide neuroscientifically grounded, mechanistic evidence for syntax-aware internal representations in LLMs and reveal functional specialization within the model’s architecture reminiscent of cortical grammar processing in humans.

📝 Abstract

Artificial Neural Networks, the building blocks of AI, were inspired by the human brain's network of neurons. Over the years, these networks have evolved to replicate the complex capabilities of the brain, allowing them to handle tasks such as image and language processing. In the realm of Large Language Models (LLMs), there has been keen interest in making the language learning process more akin to that of humans. Neuroscientific research has shown that different grammatical categories are processed by different neurons in the brain; we show that LLMs operate in a similar way. Using Llama 3, we identify the most important neurons associated with the prediction of words belonging to different part-of-speech tags. Building on this, we train a classifier on a dataset and show that the activation patterns of these key neurons can reliably predict part-of-speech tags on fresh data. The results suggest the presence of a subspace in LLMs focused on capturing part-of-speech tag concepts, resembling patterns observed in lesion studies of the brain in neuroscience.
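The abstract does not say how neuron importance is scored, so the following is only a minimal sketch of what "identifying key neurons" for a part-of-speech tag could look like. It assumes synthetic activations (not real Llama 3 data) with a planted signal on three hypothetical neurons, and it assumes a simple separation score (absolute mean difference divided by pooled standard deviation). The function names and the scoring rule are illustrative choices, not the authors' method.

```python
import random
from statistics import mean, pvariance

def make_activations(n_per_tag, n_neurons, key_neurons, seed=0):
    """Synthetic stand-in for hidden activations; NOT real Llama 3 data."""
    rng = random.Random(seed)
    rows, tags = [], []
    for tag in ("NOUN", "VERB"):
        for _ in range(n_per_tag):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_neurons)]
            if tag == "VERB":
                for j in key_neurons:
                    row[j] += 3.0  # planted POS signal (assumption)
            rows.append(row)
            tags.append(tag)
    return rows, tags

def separation_scores(rows, tags):
    """Score each neuron by |mean difference| / pooled std between the two tags."""
    scores = []
    for j in range(len(rows[0])):
        noun = [r[j] for r, t in zip(rows, tags) if t == "NOUN"]
        verb = [r[j] for r, t in zip(rows, tags) if t == "VERB"]
        pooled = ((pvariance(noun) + pvariance(verb)) / 2) ** 0.5
        scores.append(abs(mean(noun) - mean(verb)) / (pooled + 1e-9))
    return scores

def top_k_neurons(scores, k):
    """Indices of the k highest-scoring neurons."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

rows, tags = make_activations(n_per_tag=200, n_neurons=32, key_neurons=[3, 17, 25])
scores = separation_scores(rows, tags)
print("candidate key neurons:", sorted(top_k_neurons(scores, 3)))
```

On real model activations the signal would be far less clean than this planted one, and gradient- or ablation-based importance measures are common alternatives to a separation score.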

Problem

Research questions and friction points this paper is trying to address.

Identifying key neurons in LLMs for grammatical category processing

Comparing neural mechanisms between human brain and language models

Developing classifiers using neuron activations for part-of-speech prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifying key neurons for part-of-speech prediction

Training classifier using neuron activation patterns

Discovering grammatical subspace resembling brain processing
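The classifier idea listed above can be illustrated with a small sketch. It assumes synthetic activations over four hypothetical "key neurons" and swaps in a nearest-centroid classifier, which is a simple linear classifier chosen here for brevity; the source does not describe the actual classifier or dataset. Any accuracy it prints describes the synthetic data only and says nothing about the paper's results.

```python
import random

TAGS = ("NOUN", "VERB", "ADJ")

def sample(rng, tag, n_key=4):
    """One synthetic activation vector over n_key key neurons (hypothetical data)."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n_key)]
    x[TAGS.index(tag)] += 4.0  # assumption: each tag excites a different neuron
    return x

def make_split(rng, n_per_tag):
    """Shuffled list of (activation vector, tag) pairs."""
    data = [(sample(rng, t), t) for t in TAGS for _ in range(n_per_tag)]
    rng.shuffle(data)
    return data

def fit_centroids(train):
    """Mean activation vector per tag."""
    sums, counts = {}, {}
    for x, t in train:
        acc = sums.setdefault(t, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[t] = counts.get(t, 0) + 1
    return {t: [v / counts[t] for v in acc] for t, acc in sums.items()}

def predict(centroids, x):
    """Return the tag whose centroid is closest to x in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda t: sqdist(centroids[t]))

rng = random.Random(0)
train, test = make_split(rng, 300), make_split(rng, 100)  # test is held out
centroids = fit_centroids(train)
accuracy = sum(predict(centroids, x) == t for x, t in test) / len(test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Restricting the input to a handful of selected neurons is the point of the exercise: if a linear rule on so few dimensions still separates the tags on held-out data, that is consistent with a low-dimensional part-of-speech subspace.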
