Topological Invariance and Breakdown in Learning

📅 2025-10-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the mechanistic role of the learning rate in governing the topological evolution of neuron manifolds during training. Methodologically, leveraging permutation-equivariant learning rules, the authors prove that training induces a bi-Lipschitz mapping, which strongly constrains the topological structure of the neuron manifold. They identify a universal topological critical learning rate η*: training preserves topology when η < η* (a constrained-optimization phase), whereas when η > η* the topology simplifies and the manifold becomes coarser, degrading representational capacity (a simplification phase). The study introduces, for the first time, the concept of a "topological phase transition" in deep learning, independent of specific architectures or loss functions. By unifying topological manifold analysis, gradient dynamics, and margin-stability modeling, it establishes a generalizable topological analysis framework and provides a new theoretical foundation for understanding how the learning rate governs model expressivity and generalization.
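The step from "bi-Lipschitz" to "topology-preserving" in the summary follows from a standard fact about metric spaces (textbook material, not quoted from the paper):

```latex
% A map f : (X, d_X) \to (Y, d_Y) is bi-Lipschitz if for some L \ge 1,
\frac{1}{L}\, d_X(x, y) \;\le\; d_Y\!\bigl(f(x), f(y)\bigr) \;\le\; L\, d_X(x, y)
\qquad \text{for all } x, y \in X.
% The lower bound makes f injective; both bounds make f and its inverse
% (on the image) continuous, so f is a homeomorphism onto its image: it may
% stretch the neuron distribution but cannot merge points or tear it, i.e.
% it cannot change its topology.
```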

📝 Abstract
We prove that for a broad class of permutation-equivariant learning rules (including SGD, Adam, and others), the training process induces a bi-Lipschitz mapping between neurons and strongly constrains the topology of the neuron distribution during training. This result reveals a qualitative difference between small and large learning rates $η$. With a learning rate below a topological critical point $η^*$, the training is constrained to preserve all topological structure of the neurons. In contrast, above $η^*$, the learning process allows for topological simplification, making the neuron manifold progressively coarser and thereby reducing the model's expressivity. Viewed in combination with the recent discovery of the edge of stability phenomenon, the learning dynamics of neural networks under gradient descent can be divided into two phases: first they undergo smooth optimization under topological constraints, and then enter a second phase where they learn through drastic topological simplifications. A key feature of our theory is that it is independent of specific architectures or loss functions, enabling the universal application of topological methods to the study of deep learning.
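As a rough numerical illustration of the distortion that a bi-Lipschitz bound controls, the sketch below trains a tiny one-hidden-layer tanh network with plain gradient descent and measures how much training stretches or shrinks pairwise distances between neuron parameter vectors. The setup, network size, and learning-rate values are all hypothetical choices for illustration; this is not the paper's construction or proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def pairwise_dists(P):
    # Distances between all pairs of rows of P (upper triangle only).
    diff = P[:, None, :] - P[None, :, :]
    d = np.sqrt((diff ** 2).sum(-1))
    iu = np.triu_indices(len(P), k=1)
    return d[iu]

def train_and_measure(lr, steps=200, n_hidden=16, n_in=4, n_data=64):
    # Tiny one-hidden-layer tanh network, plain gradient descent on a
    # synthetic regression task.  A "neuron" is its full parameter vector
    # (incoming weights plus outgoing weight).
    X = rng.normal(size=(n_data, n_in))
    y = np.sin(X.sum(axis=1))
    W = rng.normal(size=(n_hidden, n_in)) / np.sqrt(n_in)
    a = rng.normal(size=n_hidden) / np.sqrt(n_hidden)
    neurons = lambda: np.hstack([W, a[:, None]])
    d0 = pairwise_dists(neurons())
    with np.errstate(all="ignore"):  # a large lr may blow up; reported below
        for _ in range(steps):
            h = np.tanh(X @ W.T)                     # (n_data, n_hidden)
            err = h @ a - y                          # residuals
            grad_a = h.T @ err / n_data
            grad_W = (np.outer(err, a) * (1 - h ** 2)).T @ X / n_data
            W -= lr * grad_W
            a -= lr * grad_a
    ratios = pairwise_dists(neurons()) / d0
    return ratios.max() / ratios.min()  # distortion of inter-neuron distances

for lr in (0.01, 0.1, 1.0):
    dist = train_and_measure(lr)
    print(f"lr={lr}: distortion {dist:.2f}" if np.isfinite(dist)
          else f"lr={lr}: diverged")
```

A distortion close to 1 means inter-neuron distances move almost rigidly; large or non-finite values indicate that the bound is loose or broken for that learning rate.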
Problem

Research questions and friction points this paper is trying to address.

Proving topological invariance in permutation-equivariant learning rules
Revealing critical learning rate for topological preservation
Characterizing two-phase learning dynamics in neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training induces bi-Lipschitz mapping between neurons
Learning rate determines topological preservation or simplification
Theory applies universally across architectures and loss functions
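One crude way to make "topological simplification" concrete is to count connected components of the neuron point cloud at a fixed distance scale. The sketch below implements that diagnostic with single-linkage union-find on two synthetic clouds; the scenario (spread-out neurons vs. neurons collapsed into a few groups) and the scale `eps` are hypothetical choices, not taken from the paper.

```python
import numpy as np

def components_at_scale(points, eps):
    # Number of connected components when points closer than eps are linked
    # (single-linkage via union-find): a zeroth-order topological summary
    # of the point cloud at scale eps.
    n = len(points)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    for i in range(n):
        for j in range(i + 1, n):
            if d[i, j] < eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

rng = np.random.default_rng(1)
# Hypothetical scenario: neurons spread out vs. collapsed into three
# tight, well-separated groups after simplification.
spread = rng.normal(size=(30, 5))
collapsed = np.repeat(3 * rng.normal(size=(3, 5)), 10, axis=0)
collapsed = collapsed + 0.01 * rng.normal(size=collapsed.shape)

print(components_at_scale(spread, 0.5))     # many components
print(components_at_scale(collapsed, 0.5))  # → 3: the cloud has coarsened
```

A drop in this count at a fixed scale, as the learning rate crosses some threshold, would be one observable signature of the coarsening the paper describes.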
Yongyi Yang
University of Michigan
Machine learning · Graph neural networks
Tomaso Poggio
Massachusetts Institute of Technology
Isaac Chuang
Massachusetts Institute of Technology
Liu Ziyin
Massachusetts Institute of Technology, NTT Research