🤖 AI Summary
Problem: Can a unified theoretical framework explain learning mechanisms in both deep neural networks and biological brains?
Method: Leveraging non-equilibrium statistical physics, we investigate quasi-criticality, a shared dynamical regime situated between the absorbing and active phases. We employ neuronal avalanche analysis, finite-size scaling, and Barkhausen noise modeling to characterize critical dynamics across artificial and biological neural systems.
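As a concrete illustration of the avalanche analysis, the following is a minimal Python sketch: avalanches are taken as maximal runs of supra-threshold activity, and the size exponent $\tau$ is estimated by continuous maximum likelihood (Clauset et al., 2009). The thresholding, binning, and synthetic Poisson drive are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def avalanche_sizes(activity, threshold=0.0):
    """Split a 1-D activity time series into avalanches.

    An avalanche is a maximal run of time bins whose activity
    exceeds `threshold`; its size is the total activity in the
    run. Threshold and binning are illustrative choices.
    """
    active = activity > threshold
    sizes, current = [], 0.0
    for is_active, x in zip(active, activity):
        if is_active:
            current += x
        elif current > 0:
            sizes.append(current)
            current = 0.0
    if current > 0:
        sizes.append(current)
    return np.asarray(sizes)

def power_law_exponent(sizes, s_min=1.0):
    """Continuous maximum-likelihood estimate of the size
    exponent tau (Clauset et al., 2009)."""
    s = sizes[sizes >= s_min]
    return 1.0 + len(s) / np.sum(np.log(s / s_min))

# Synthetic stand-in for binned population activity.
rng = np.random.default_rng(0)
activity = rng.poisson(0.8, size=100_000).astype(float)
sizes = avalanche_sizes(activity)
print(f"estimated tau = {power_law_exponent(sizes):.2f}")
```

In practice one would also estimate the duration exponent and check the two against the crackling noise relation given below.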
Contribution/Results: We demonstrate that crackling noise scaling laws approximately hold in the quasi-critical regime, and that maximal susceptibility, rather than proximity to the critical point itself, is the more reliable predictor of generalization performance. Finite-size scaling further identifies distinct universality classes, including Barkhausen noise and directed percolation. Experiments span diverse weight-initialization schemes in deep network training. This work establishes a physically grounded, unified framework bridging artificial and biological neural networks, offering a blueprint for designing high-performance learning systems from fundamental physical principles.
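For reference, the crackling noise scaling relation invoked above is the standard one from the avalanche literature (Sethna et al., 2001): if avalanche sizes and durations are distributed as $P(S)\sim S^{-\tau}$ and $P(T)\sim T^{-\tau_t}$, and mean size grows with duration as $\langle S\rangle(T)\sim T^{1/(\sigma\nu z)}$, then

$$\frac{\tau_t - 1}{\tau - 1} = \frac{1}{\sigma\nu z}.$$

In the quasi-critical regime this equality holds only approximately, because the strong external drive rounds off the underlying phase transition.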
📝 Abstract
Deep neural networks and brains both learn and share superficial similarities: processing nodes are likened to neurons and adjustable weights are likened to modifiable synapses. But can a unified theoretical framework be found to underlie them both? Here we show that the equations used to describe neuronal avalanches in living brains can also be applied to cascades of activity in deep neural networks. These equations are derived from non-equilibrium statistical physics and show that deep neural networks learn best when poised between absorbing and active phases. Because these networks are strongly driven by inputs, however, they do not operate at a true critical point but within a quasi-critical regime, one that still approximately satisfies crackling noise scaling relations. By training networks with different initializations, we show that maximal susceptibility is a more reliable predictor of learning than proximity to the critical point itself. This provides a blueprint for engineering improved network performance. Finally, using finite-size scaling we identify distinct universality classes, including Barkhausen noise and directed percolation. This theoretical framework demonstrates that universal features are shared by both biological and artificial neural networks.
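To make the susceptibility result concrete, here is a minimal sketch of how one might sweep the weight-initialization gain of a toy driven recurrent network and locate the gain of maximal susceptibility. The tanh dynamics, the weak noise drive, and the susceptibility proxy $\chi = N\,\mathrm{Var}(\rho)$ (with $\rho$ the mean absolute unit activity) are standard statistical-physics choices used here as illustrative assumptions, not the paper's protocol.

```python
import numpy as np

def activity_susceptibility(gain, n=400, steps=2000, burn=500, seed=0):
    """Run a driven random tanh network at a given weight gain and
    return chi = N * Var(rho), where rho is the mean absolute unit
    activity (an order-parameter proxy). Illustrative stand-in,
    not the paper's exact protocol.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, gain / np.sqrt(n), size=(n, n))
    x = rng.normal(0.0, 0.1, size=n)
    rho = []
    for t in range(steps):
        drive = 0.01 * rng.normal(size=n)  # weak external input
        x = np.tanh(W @ x + drive)
        if t >= burn:
            rho.append(np.mean(np.abs(x)))
    rho = np.asarray(rho)
    return n * rho.var()

# Sweep initialization gains; the most susceptible one, rather than
# the nominally critical one, would be the predicted best learner.
gains = np.linspace(0.5, 2.0, 16)
chis = [activity_susceptibility(g) for g in gains]
print(f"gain of maximal susceptibility ~ {gains[int(np.argmax(chis))]:.2f}")
```

The design choice mirrors the abstract's claim: one selects the initialization by measured susceptibility rather than by distance to the nominal critical point.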