Learning Modular Exponentiation with Transformers

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Modular exponentiation (a^b mod m), a fundamental number-theoretic operation, remains understudied in mechanistic interpretability. Method: We train a 4-layer encoder-decoder Transformer using principled data sampling and inverse-operation pairing, enabling systematic analysis of learning dynamics. Contribution/Results: We report the first observation of cross-modulus grokking—sudden generalization beyond training modulus ranges. Through PCA embedding analysis, attention head subgraph identification, and activation patching, we discover that modular exponentiation is implemented by a sparse, modular circuit composed of only a few attention heads. This circuit internally reconstructs shared arithmetic structures—including modular multiplication chains and exponent decomposition—demonstrating structured, interpretable computation rather than black-box fitting. Our findings provide the first fine-grained causal evidence for how neural networks acquire and execute number-theoretic reasoning, establishing modular exponentiation as an emergent, mechanistically explainable capability grounded in identifiable neural circuitry.
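The exponent decomposition and modular multiplication chains described above correspond to the classic square-and-multiply algorithm for computing a^b mod m. A minimal sketch (illustrative only, not code from the paper):

```python
def mod_exp(a: int, b: int, m: int) -> int:
    """Square-and-multiply: decompose the exponent into bits and
    chain modular multiplications -- the same arithmetic structures
    the summary says the learned circuit reconstructs internally."""
    result = 1
    base = a % m
    while b > 0:
        if b & 1:                     # current exponent bit is set
            result = (result * base) % m
        base = (base * base) % m      # square the base for the next bit
        b >>= 1
    return result

print(mod_exp(3, 13, 7))  # → 3, matching Python's built-in pow(3, 13, 7)
```

This runs in O(log b) multiplications, which is why the decomposition matters both for cryptographic implementations and, per the summary, as a structure a model can plausibly learn.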

📝 Abstract
Modular exponentiation is crucial to number theory and cryptography, yet remains largely unexplored from a mechanistic interpretability standpoint. We train a 4-layer encoder-decoder Transformer model to perform this operation and investigate the emergence of numerical reasoning during training. Utilizing principled sampling strategies, PCA-based embedding analysis, and activation patching, we examine how number-theoretic properties are encoded within the model. We find that reciprocal operand training leads to strong performance gains, with sudden generalization across related moduli. These synchronized accuracy surges reflect grokking-like dynamics, suggesting the model internalizes shared arithmetic structure. We also identify a subgraph, consisting entirely of final-layer attention heads, that is sufficient to achieve full performance on the task of regular exponentiation. These results suggest that transformer models learn modular arithmetic through specialized computational circuits, paving the way for more interpretable and efficient neural approaches to modular exponentiation.
Problem

Research questions and friction points this paper is trying to address.

Understanding how transformers learn modular exponentiation operations
Investigating numerical reasoning emergence during model training
Identifying specialized circuits for modular arithmetic in transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer model learns modular exponentiation
PCA and activation patching analyze number encoding
Specialized attention subgraph achieves full performance
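Activation patching, named in both the summary and abstract, tests causal relevance by caching an activation from a clean run and splicing it into a corrupted run: if the output recovers, that activation site carries the task-relevant signal. A toy NumPy sketch of the mechanic, with an assumed two-layer stand-in network rather than the paper's Transformer:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16))   # hypothetical first-layer weights
W2 = rng.normal(size=(16, 4))   # hypothetical readout weights

def forward(x, patch=None):
    """Two-layer toy network; `patch`, if given, overrides the hidden
    activation -- the core move in activation patching."""
    h = np.tanh(x @ W1)
    if patch is not None:
        h = patch
    return h @ W2

clean_x = rng.normal(size=8)
corrupt_x = rng.normal(size=8)

# Cache the hidden activation from the clean run.
clean_h = np.tanh(clean_x @ W1)

# Patch it into the corrupted run and compare outputs.
patched_out = forward(corrupt_x, patch=clean_h)
clean_out = forward(clean_x)
print(bool(np.allclose(patched_out, clean_out)))  # → True
```

Here patching the full hidden layer trivially restores the clean output; in practice one patches individual heads or layers and measures partial recovery, which is how a sparse sufficient subgraph like the one reported here can be isolated.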
Authors
David Demitri Africa — UK AI Security Institute (AISI), Alignment
Sara M. Kapoor — Department of Computer Science and Technology, University of Cambridge
Theo Simon Sorg — Department of Computer Science and Technology, University of Cambridge