Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

📅 2026-01-06
🏛️ arXiv.org
🤖 AI Summary
This work addresses the limitations of large language models (LLMs) in large-scale counting tasks, where performance degrades due to the depth constraints of the Transformer architecture. Inspired by System-2 cognition, the authors propose a test-time strategy that decomposes complex counting problems into independently solvable subproblems. Through observational and causal mediation analysis, attention head tracing, and representational probing, the study provides the first mechanistic account of System-2-like counting within LLMs, elucidating how counting information is stored, propagated, and aggregated across layers. This approach not only substantially improves counting accuracy on large-scale tasks but also overcomes inherent architectural limitations, offering both strong empirical performance and enhanced interpretability.

📝 Abstract
Large language models (LLMs), despite strong performance on complex mathematical problems, exhibit systematic limitations in counting tasks. This issue arises from architectural limits of transformers, where counting is performed across layers, leading to degraded precision for larger counting problems due to depth constraints. To address this limitation, we propose a simple test-time strategy inspired by System-2 cognitive processes that decomposes large counting tasks into smaller, independent sub-problems that the model can reliably solve. We evaluate this approach using observational and causal mediation analyses to understand the underlying mechanism of this System-2-like strategy. Our mechanistic analysis identifies key components: latent counts are computed and stored in the final item representations of each part, transferred to intermediate steps via dedicated attention heads, and aggregated in the final stage to produce the total count. Experimental results demonstrate that this strategy enables LLMs to surpass architectural limitations and achieve high accuracy on large-scale counting tasks. This work provides mechanistic insight into System-2 counting in LLMs and presents a generalizable approach for improving and understanding their reasoning behavior.
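The decomposition-and-aggregation strategy described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `count_chunk` is a hypothetical stand-in for querying the LLM on one small sub-problem (which the paper argues the model can solve reliably within its depth budget), simulated here with an exact Python count so the aggregation logic is runnable.

```python
def count_chunk(items, target):
    """Stand-in for an LLM call: count occurrences of `target` in one small chunk."""
    return sum(1 for x in items if x == target)

def system2_count(items, target, chunk_size=10):
    """Decompose a large counting task into independently solvable sub-problems,
    then aggregate the partial counts into the total (the final System-2 stage)."""
    partials = [
        count_chunk(items[i:i + chunk_size], target)
        for i in range(0, len(items), chunk_size)
    ]
    return sum(partials)

# 120 items, of which 80 are "apple" -- large enough that direct
# single-pass counting is where transformer depth limits would bite.
sequence = ["apple", "pear", "apple"] * 40
print(system2_count(sequence, "apple", chunk_size=10))  # → 80
```

The key property is that each sub-problem's size is bounded by `chunk_size`, so per-chunk accuracy stays high regardless of the total sequence length; only the cheap final summation scales with the input.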
Problem

Research questions and friction points this paper is trying to address.

counting
large language models
architectural limitations
transformer depth
reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

mechanistic interpretability
System-2 reasoning
counting in LLMs
attention heads
test-time decomposition
Hosein Hasani
Sharif University of Technology
Machine Learning
Mohammadali Banayeeanzade
Sharif University of Technology
Ali Nafisi
Sharif University of Technology
Sadegh Mohammadian
Sharif University of Technology
Fatemeh Askari
Sharif University of Technology
Mobin Bagherian
Sharif University of Technology
Amirmohammad Izadi
Sharif University of Technology
M. Baghshah
Sharif University of Technology