Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization

📅 2024-12-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates functional specialization between memorization and generalization in large language models (LLMs). To this end, we train small-scale LLMs on a controllable synthetic dataset and, for the first time, identify spatially segregated neuron populations supporting memorization versus generalization at the single-neuron level. Using representation-separability modeling and fine-grained neuron-activation analysis, we classify model behavior with high accuracy (>92%). We further propose a gradient-guided, inference-time targeted intervention that substantially increases selection of the target behavior (+38%) without modifying any parameters. Our core contribution is the empirical demonstration that LLMs exhibit detectable, predictable, and steerable functional neural differentiation. This work establishes a mechanistic framework for probing and modulating cognitive behaviors in LLMs at the neural-circuit level, bridging interpretability research with causal cognitive neuroscience.
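The paper's actual intervention is gradient-guided; as a purely illustrative, hypothetical sketch of the general idea (steering behavior at inference time by amplifying a pre-identified neuron population, without touching model weights), one might write something like the following. The function name, the neuron indices, and the scaling factor are all assumptions, not the paper's implementation:

```python
import numpy as np

def steer_activations(hidden, target_neurons, alpha=2.0):
    """Scale a pre-identified neuron subset at inference time.

    hidden: 1-D activation vector from one transformer layer.
    target_neurons: indices believed to support the target behavior.
    alpha: amplification factor (alpha > 1 promotes the behavior,
           0 < alpha < 1 suppresses it). Model weights are untouched.
    """
    steered = hidden.copy()
    steered[target_neurons] *= alpha
    return steered

# Toy example: 8 activations; neurons 2 and 5 are the "target" population.
h = np.array([0.1, 0.4, 0.9, 0.2, 0.0, 0.7, 0.3, 0.5])
h_steered = steer_activations(h, target_neurons=[2, 5], alpha=2.0)
```

Because the intervention only rescales a few activations per forward pass, it adds negligible cost and is trivially reversible, which matches the summary's claim of behavior selection "without parameter modification".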

📝 Abstract
In this paper, we explore the foundational mechanisms of memorization and generalization in Large Language Models (LLMs), inspired by the functional specialization observed in the human brain. Our investigation serves as a case study leveraging specially designed datasets and experimental-scale LLMs to lay the groundwork for understanding these behaviors. Specifically, we aim to first enable LLMs to exhibit both memorization and generalization by training with the designed dataset, then (a) examine whether LLMs exhibit neuron-level spatial differentiation for memorization and generalization, (b) predict these behaviors using model internal representations, and (c) steer the behaviors through inference-time interventions. Our findings reveal that neuron-wise differentiation of memorization and generalization is observable in LLMs, and targeted interventions can successfully direct their behavior.
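Point (b) above, predicting behavior from internal representations, can be illustrated with a deliberately simplified sketch: a nearest-centroid probe over hidden activations labeled "memorization" versus "generalization". The data below is synthetic and every name is an assumption for illustration; the paper's actual probing method is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for hidden activations collected while the model
# behaves in memorization mode (cluster A) vs. generalization mode (B).
mem = rng.normal(loc=-1.0, scale=0.5, size=(100, 16))
gen = rng.normal(loc=+1.0, scale=0.5, size=(100, 16))
X = np.vstack([mem, gen])
y = np.array([0] * 100 + [1] * 100)  # 0 = memorize, 1 = generalize

# Nearest-centroid probe: one prototype activation per behavior.
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(acts):
    """Assign each activation vector to the nearer behavior centroid."""
    dists = np.linalg.norm(acts[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

accuracy = (predict(X) == y).mean()
```

On well-separated synthetic clusters like these the probe is near-perfect; the interesting claim in the paper is that real LLM activations are separable enough for high-accuracy prediction at all.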
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Information Processing
Model Behavior Adjustment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Division of Labor
Memory and Reasoning
Behavior Modulation
Authors

Yi-Fu Fu, National Taiwan University
Yu-Chieh Tu, National Taiwan University
Tzu-Ling Cheng, National Taiwan University
Cheng-Yu Lin, National Taiwan University
Yi-Ting Yang, National Taiwan University
Heng-Yi Liu, National Taiwan University
Keng-Te Liao, National Taiwan University
Da-Cheng Juan, Google Research, National Tsing Hua University, Carnegie Mellon University
Research areas: Machine Learning, Data Mining, Energy-Efficient Computing
Shou-De Lin, National Taiwan University
Research areas: AI, Machine Learning, Natural Language Processing