Topological Signatures of Grokking

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates the internal mechanisms behind "grokking", the transition from memorization to generalization during neural network training. Focusing on modular arithmetic tasks, the authors apply persistent homology to point clouds derived from model embedding matrices and compare the results with Fourier analysis and local intrinsic dimension estimates across different data regimes. They report that grokking coincides with a sharp increase in both the maximum and total persistence of first homology ($H_1$), and their ablations indicate that this topological shift is tied to generalization rather than memorization. The findings show how the cyclic structure inherent in the task is encoded geometrically and topologically in the representation space, and suggest persistent homology as an interpretable framework for analyzing how networks internalize latent structure during training.
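
As a rough illustration of the Fourier diagnostic mentioned above, the sketch below takes a discrete Fourier transform along the token axis of an embedding matrix and sums the power over embedding dimensions. The toy embedding (a single cosine/sine pair at frequency 3 for modulus 11) and the helper names `toy_embedding`, `fourier_power`, and `dominant_frequency` are assumptions made for illustration; they are not taken from the paper.

```python
import cmath
import math


def toy_embedding(p, k):
    """Hypothetical embedding: token a -> (cos, sin) of angle 2*pi*k*a/p."""
    return [
        (math.cos(2 * math.pi * k * a / p), math.sin(2 * math.pi * k * a / p))
        for a in range(p)
    ]


def fourier_power(embedding):
    """Power per frequency of a DFT over tokens, summed over embedding dims."""
    p = len(embedding)
    dims = len(embedding[0])
    power = []
    for f in range(p):
        total = 0.0
        for c in range(dims):
            # DFT coefficient of column c at frequency f
            coeff = sum(
                embedding[a][c] * cmath.exp(-2j * math.pi * f * a / p)
                for a in range(p)
            )
            total += abs(coeff) ** 2
        power.append(total)
    return power


def dominant_frequency(power):
    """Frequency in 1..p//2 with the largest power (f and p - f mirror each other)."""
    p = len(power)
    return max(range(1, p // 2 + 1), key=lambda f: power[f])


if __name__ == "__main__":
    p = 11
    power = fourier_power(toy_embedding(p, k=3))
    print("dominant frequency:", dominant_frequency(power))
```

For a real model, the rows of the learned embedding matrix would take the place of `toy_embedding`; a spectrum concentrated on a few frequencies is the kind of structure this diagnostic looks for.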

📝 Abstract

We study the grokking phenomenon through the lens of topology. Using persistent homology on point clouds derived from the embedding matrices of a range of models trained on modular arithmetic with varying primes, we identify a clear and consistent topological signature of grokking: a sharp increase in both the maximum and total persistence of first homology ($H_1$). Persistence diagrams reveal the emergence of a dominant long-lived topological feature together with increasingly structured secondary features, reflecting the underlying cyclic structure of the task. Compared to existing spectral and geometric diagnostics -- specifically, Fourier analysis and local intrinsic dimension -- persistent homology provides a unified geometric and topological characterization of representation learning, capturing both local and global multi-scale structure. Ablations across data regimes and control settings show that these topological transitions are tied to generalization rather than memorization. Our results suggest that persistent homology offers a principled and interpretable framework for analyzing how neural networks internalize latent structure during training.
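
The two summary statistics named in the abstract, maximum and total persistence of $H_1$, can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions, not the authors' pipeline: it builds a Vietoris-Rips filtration up to triangles, reduces the boundary matrix over $\mathbb{Z}/2$, and reads off the $H_1$ bars. The point cloud (12 evenly spaced points on a unit circle, standing in for embedding rows with cyclic structure) and the function names are hypothetical. The brute-force construction only suits small clouds; a dedicated library would be needed at realistic sizes.

```python
import itertools
import math


def h1_persistence(points):
    """(birth, death) pairs of H1 for a Vietoris-Rips filtration, Z/2 coefficients."""
    n = len(points)

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Each simplex is (filtration value, dimension, sorted vertex tuple).
    simplices = [(0.0, 0, (i,)) for i in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        simplices.append((dist(i, j), 1, (i, j)))
    for i, j, k in itertools.combinations(range(n), 3):
        value = max(dist(i, j), dist(i, k), dist(j, k))
        simplices.append((value, 2, (i, j, k)))

    # Sorting by (value, dimension) guarantees faces come before cofaces.
    simplices.sort()
    index = {verts: idx for idx, (_, _, verts) in enumerate(simplices)}

    low_to_col = {}  # lowest nonzero row -> reduced column owning that row
    pairs = []
    for value, dim, verts in simplices:
        if dim == 0:
            continue
        # Boundary column: indices of the codimension-1 faces.
        col = {index[face] for face in itertools.combinations(verts, dim)}
        while col:
            low = max(col)
            if low not in low_to_col:
                break
            col ^= low_to_col[low]  # add the earlier column mod 2
        if col:
            low = max(col)
            low_to_col[low] = col
            if dim == 2:
                # An edge (birth) paired with a triangle (death) is an H1 bar.
                pairs.append((simplices[low][0], value))
    return pairs


def persistence_stats(pairs, tol=1e-9):
    """Maximum and total persistence, ignoring bars shorter than tol."""
    lifetimes = [death - birth for birth, death in pairs if death - birth > tol]
    return max(lifetimes, default=0.0), sum(lifetimes)


def circle_points(n):
    """Hypothetical stand-in for embeddings with cyclic structure."""
    return [
        (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


if __name__ == "__main__":
    max_pers, total_pers = persistence_stats(h1_persistence(circle_points(12)))
    print(f"max H1 persistence:   {max_pers:.4f}")
    print(f"total H1 persistence: {total_pers:.4f}")
```

On the toy circle, a single long-lived loop dominates, so maximum and total persistence coincide. Tracking both quantities over training checkpoints is one way to look for the sharp increase the abstract describes.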

Problem

Research questions and friction points this paper is trying to address.

grokking

topological signatures

persistent homology

representation learning

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

persistent homology

grokking

topological signature