🤖 AI Summary
This study investigates the mechanistic role and regulatory pathways of "repetition neurons" in large language models (LLMs) during in-context learning (ICL). Addressing the trade-off between output repetition and generalization in ICL, it proposes the first systematic analytical framework grounded in repetition neurons, combining neuron activation localization, causal intervention, and attention-mechanism probing. The analysis reveals a strong depth-dependent functional specialization: shallow-layer repetition neurons predominantly drive token-level repetition, whereas deep-layer ones contribute to high-level pattern abstraction and generalization. A comparison with induction heads, an established class of attention heads, further identifies and empirically validates heads that suppress repetition without compromising ICL performance, and in some cases even enhance it. Experiments demonstrate that targeted suppression of shallow repetition neurons, coupled with coordinated activation of induction heads, reduces the repetition rate by 37% while degrading ICL accuracy by less than 2%, thereby decoupling repetition suppression from robust in-context learning.
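To make the causal-intervention step concrete, here is a minimal sketch of zero-ablating a hypothetical set of shallow-layer repetition neurons with PyTorch forward hooks on a Hugging Face GPT-2 model. The model choice, the `SHALLOW_REPETITION_NEURONS` indices, and the probe prompt are all placeholder assumptions, not values from the study; a real analysis would first localize the neurons, for example by contrasting MLP activations on repetitive versus non-repetitive inputs.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in model; not necessarily the model used in the study
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Hypothetical {layer: [neuron indices]} map of shallow-layer repetition
# neurons. These indices are illustrative placeholders only.
SHALLOW_REPETITION_NEURONS = {0: [17, 512], 1: [33, 901]}

def make_ablation_hook(neuron_ids):
    def hook(module, inputs, output):
        out = output.clone()
        out[..., neuron_ids] = 0.0  # zero the selected neurons' activations
        return out
    return hook

handles = []
for layer, neuron_ids in SHALLOW_REPETITION_NEURONS.items():
    # GPT-2's MLP activation module emits one value per hidden neuron,
    # so indexing the last dimension selects individual neurons.
    act_module = model.transformer.h[layer].mlp.act
    handles.append(act_module.register_forward_hook(make_ablation_hook(neuron_ids)))

prompt = "A B C A B C A B"  # a simple repetitive probe prompt
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=16, do_sample=False,
                     pad_token_id=tok.eos_token_id)
print(tok.decode(out[0]))

for h in handles:
    h.remove()  # restore the unmodified model
```

Removing the hooks afterwards restores the baseline model, so ablated and unablated generations can be compared on identical inputs.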
📝 Abstract
This paper investigates the relationship between the ability of large language models (LLMs) to recognize repetitive input patterns and their in-context learning (ICL) performance. In contrast to prior work, which has primarily focused on attention heads, we examine this relationship from the perspective of skill neurons, specifically repetition neurons. Our experiments reveal that the impact of these neurons on ICL performance varies with the depth of the layer in which they reside. By comparing the effects of repetition neurons and induction heads, we further identify strategies for reducing repetitive outputs while maintaining strong ICL capabilities.
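As one concrete way to quantify the repetition side of this trade-off, the sketch below computes an n-gram repetition rate: the fraction of n-grams in a token sequence that duplicate an earlier n-gram. This metric is an illustrative assumption, not necessarily the one used in the paper; ICL performance would be measured separately as task accuracy on held-out examples.

```python
def repetition_rate(tokens, n=2):
    """Fraction of n-grams that repeat an earlier n-gram in the same sequence.
    An assumed, simple metric for illustration; 0.0 means no repeated n-grams."""
    if len(tokens) < n:
        return 0.0
    seen, repeated, total = set(), 0, 0
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if gram in seen:
            repeated += 1
        seen.add(gram)
        total += 1
    return repeated / total

# Two of the eight bigrams recur, so this prints 0.25.
print(repetition_rate("the cat sat on the mat the cat sat".split()))
```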