🤖 AI Summary
The absence of a unified theoretical framework for deep learning impedes rigorous understanding of its generalization, implicit bias, and feature learning mechanisms.
Method: The work develops a systematic mapping between deep learning and statistical field theory: neural network function spaces are modeled as random fields, and training dynamics are formulated as field-evolution processes. Leveraging path integrals, renormalization group methods, Gaussian process limits, and continuous-depth approximations, it assembles a physics-inspired, first-principles theoretical framework.
Contribution/Results: The framework unifies explanations of key phenomena—including kernel-like behavior in the infinite-width limit, implicit regularization induced by gradient descent, and phase-transition–like feature learning—within a single analytically tractable and broadly generalizable formalism. It provides explicit, interpretable theoretical tools that advance deep learning from an empirical discipline toward a foundational science grounded in first principles.
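As an illustration of what "modeling network function spaces as random fields" means in practice, the Bayesian posterior over the outputs of a wide network can be written as a field-theoretic partition function. The sketch below is generic: the action $S$, kernel $K$, noise scale $\sigma$, and dataset size $P$ are notation chosen here for illustration rather than taken from the paper.

```latex
% Illustrative sketch only; the action S, kernel K, and noise scale \sigma are
% generic notation chosen here rather than taken from the paper.
\[
  Z = \int \mathcal{D}f \, e^{-S[f]}, \qquad
  S[f] = \frac{1}{2} \iint dx \, dx' \, f(x) \, K^{-1}(x,x') \, f(x')
       + \frac{1}{2\sigma^{2}} \sum_{n=1}^{P} \bigl( f(x_n) - y_n \bigr)^{2},
\]
```

Here the Gaussian term encodes the prior induced by random initialization (its kernel $K$ being the infinite-width Gaussian process kernel), the data term acts as a source, and finite-width or feature-learning effects enter as additional interaction terms that can be treated perturbatively or with renormalization group techniques.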
📝 Abstract
Deep learning algorithms have made incredible strides in the past decade, yet, owing to the complexity of these algorithms, the science of deep learning remains in its early stages. Since deep learning is an experimentally driven field, it is natural to seek a theory of deep learning within the physics paradigm. As deep learning is largely about learning functions and distributions over functions, statistical field theory, a rich and versatile toolbox for tackling complex distributions over functions (fields), is an obvious choice of formalism. Research efforts carried out in the past few years have demonstrated the ability of field theory to provide useful insights into generalization, implicit bias, and feature learning effects. Here we provide a pedagogical review of this emerging line of research.
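To make the kernel-like behavior in the infinite-width limit concrete, the following minimal numerical sketch compares the output covariance of randomly initialized wide networks with the analytical arc-cosine kernel for ReLU units. The conventions (a single hidden ReLU layer, no biases, hidden weights of variance 1/d, readout weights of variance 1/width) are assumptions made here for illustration, not code or notation from the reviewed work.

```python
# Minimal numerical sketch of the infinite-width Gaussian-process / kernel limit.
# Assumptions (not taken from the reviewed paper): one hidden ReLU layer, no
# biases, hidden weights ~ N(0, 1/d), readout weights ~ N(0, 1/width).
import numpy as np

def sample_outputs(x1, x2, width, n_samples, rng):
    """Outputs (f(x1), f(x2)) of independently sampled random ReLU networks."""
    X = np.stack([x1, x2], axis=1)                                # (d, 2)
    d = X.shape[0]
    outs = np.empty((n_samples, 2))
    for i in range(n_samples):
        W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(width, d))   # hidden layer
        a = rng.normal(0.0, 1.0 / np.sqrt(width), size=width)    # linear readout
        outs[i] = a @ np.maximum(W @ X, 0.0)                      # (f(x1), f(x2))
    return outs

def relu_kernel(x1, x2):
    """Analytical infinite-width kernel K(x1, x2) for the architecture above."""
    d = x1.shape[0]
    s1, s2 = np.linalg.norm(x1) / np.sqrt(d), np.linalg.norm(x2) / np.sqrt(d)
    rho = np.clip(x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2)), -1.0, 1.0)
    theta = np.arccos(rho)
    return s1 * s2 / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

rng = np.random.default_rng(0)
d = 5
x1, x2 = rng.normal(size=d), rng.normal(size=d)
f = sample_outputs(x1, x2, width=1024, n_samples=4000, rng=rng)

print("empirical covariance of (f(x1), f(x2)) over random networks:")
print(np.cov(f.T))
print("analytical kernel matrix:")
print(np.array([[relu_kernel(x1, x1), relu_kernel(x1, x2)],
                [relu_kernel(x1, x2), relu_kernel(x2, x2)]]))
```

At width 1024 the two matrices typically agree to within a few percent; at small widths the deviations grow, and it is precisely such finite-width corrections that the field-theoretic treatment aims to organize.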