There Will Be a Scientific Theory of Deep Learning

📅 2026-04-23
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Deep learning currently lacks a systematic scientific theory that explains its training dynamics, representational structures, and performance patterns. To address this gap, this work proposes a theoretical framework termed “learning mechanics,” which integrates five methodological pillars: analytically tractable idealized models, solvable limiting analyses, macroscopic statistical laws, hyperparameter decoupling, and cross-system universality. This framework establishes a falsifiable, quantitative, and broadly applicable paradigm for deep learning theory. It not only synthesizes existing theoretical advances and responds to key criticisms but also delineates critical open questions, thereby advancing deep learning from an empirical practice toward a mature scientific discipline grounded in rigorous theoretical foundations.

📝 Abstract
In this paper, we make the case that a scientific theory of deep learning is emerging. By this we mean a theory which characterizes important properties and statistics of the training process, hidden representations, final weights, and performance of neural networks. We pull together major strands of ongoing research in deep learning theory and identify five growing bodies of work that point toward such a theory: (a) solvable idealized settings that provide intuition for learning dynamics in realistic systems; (b) tractable limits that reveal insights into fundamental learning phenomena; (c) simple mathematical laws that capture important macroscopic observables; (d) theories of hyperparameters that disentangle them from the rest of the training process, leaving simpler systems behind; and (e) universal behaviors shared across systems and settings which clarify which phenomena call for explanation. Taken together, these bodies of work share certain broad traits: they are concerned with the dynamics of the training process; they primarily seek to describe coarse aggregate statistics; and they emphasize falsifiable quantitative predictions. We argue that the emerging theory is best thought of as a mechanics of the learning process, and suggest the name learning mechanics. We discuss the relationship between this mechanics perspective and other approaches for building a theory of deep learning, including the statistical and information-theoretic perspectives. In particular, we anticipate a symbiotic relationship between learning mechanics and mechanistic interpretability. We also review and address common arguments that fundamental theory will not be possible or is not important. We conclude with a portrait of important open directions in learning mechanics and advice for beginners. We host further introductory materials, perspectives, and open questions at learningmechanics.pub.
Problem

Research questions and friction points this paper is trying to address.

scientific theory
deep learning
learning mechanics
training dynamics
universal behaviors
Innovation

Methods, ideas, or system contributions that make the work stand out.

learning mechanics
deep learning theory
universal behavior
tractable limits
macroscopic observables