🤖 AI Summary
The ubiquitous avalanche-like dynamics observed in deep neural networks (DNNs) lack a mechanistic explanation.
Method: We develop a stochastic theory of deep information propagation grounded in central-limit-theorem-level fluctuations, identifying four effective coupling parameters $(r, h, D_1, D_2)$ that fully determine the universality class of the network dynamics. Tuning these couplings selects between avalanche dynamics generated by Brownian motion in a logarithmic potential well and by free Brownian motion with an absorbing boundary, each corresponding to a distinct critical phase transition. Integrating stochastic analysis, Landau's phase-transition theory, and directed percolation models, we establish a unified framework linking static critical exponents to active cascade dynamics.
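For orientation, a minimal sketch of the two limiting dynamics in generic overdamped Langevin form (the variable $x$, trap strength $a$, and diffusion constant $D$ are illustrative placeholders, not the paper's couplings $(r, h, D_1, D_2)$):

$$
\dot{x} = -\frac{a}{x} + \sqrt{2D}\,\xi(t) \quad \text{(BM in the logarithmic trap } U(x)=a\ln x\text{)}, \qquad
\dot{x} = \sqrt{2D}\,\xi(t),\; x>0 \text{ absorbed at } x=0 \quad \text{(free BM with an absorbing boundary)},
$$

where $\xi(t)$ is Gaussian white noise with $\langle \xi(t)\,\xi(t')\rangle = \delta(t-t')$.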
Results: Numerical simulations confirm that activation-function design controls both the type of dynamical phase transition and the avalanche statistics, including the power-law exponents, thereby providing, for the first time, a first-principles derivation of the universal origin of collective critical behavior in DNNs.
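A minimal simulation sketch of this kind of experiment, assuming a random ReLU network at initialization with Gaussian weights, with layer activity defined as ReLU-gate flips induced by a small input perturbation; the activity definition, the parameters `sigma_w`, `sigma_b`, `eps`, and the avalanche criterion are illustrative assumptions, not the paper's protocol:

```python
import numpy as np

def perturbation_cascade(depth=128, width=128, sigma_w=1.5, sigma_b=0.05,
                         eps=1e-3, rng=None):
    """Propagate a baseline and a slightly perturbed input through the same
    random ReLU network and record, per layer, how many units flip their
    ReLU gate (an illustrative notion of 'activity')."""
    if rng is None:
        rng = np.random.default_rng()
    x = rng.standard_normal(width)             # baseline input
    xp = x + eps * rng.standard_normal(width)  # perturbed input
    activity = []
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        b = rng.standard_normal(width) * sigma_b
        h, hp = W @ x + b, W @ xp + b
        flips = int(np.count_nonzero((h > 0) != (hp > 0)))
        activity.append(flips)
        x, xp = np.maximum(h, 0.0), np.maximum(hp, 0.0)
        if np.max(np.abs(x - xp)) < 1e-12:
            break                              # perturbation absorbed: cascade dies
    return np.array(activity)

# Avalanche size = total flipped units; duration = layers until extinction.
rng = np.random.default_rng(0)
cascades = [perturbation_cascade(rng=rng) for _ in range(100)]
sizes = [c.sum() for c in cascades]
durations = [len(c) for c in cascades]
print("mean avalanche size:", np.mean(sizes),
      "mean duration:", np.mean(durations))
```

Sweeping `sigma_w` (or changing the nonlinearity) moves such a random network between regimes where cascades typically die out and regimes where they persist, which is the kind of activation-dependent control over collective dynamics the results refer to.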
📝 Abstract
Deep neural networks (DNNs) exhibit crackling-like avalanches whose origin lacks a mechanistic explanation. Here, I derive a stochastic theory of deep information propagation (DIP) by incorporating Central Limit Theorem (CLT)-level fluctuations. Four effective couplings $(r, h, D_1, D_2)$ characterize the dynamics, yielding a Landau description of the static exponents and a Directed Percolation (DP) structure of activity cascades. Tuning the couplings selects between avalanche dynamics generated by a Brownian Motion (BM) in a logarithmic trap and an absorbed free BM, each corresponding to a distinct universality class. Numerical simulations confirm the theory and demonstrate that activation function design controls the collective dynamics in random DNNs.
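For context, the standard first-passage results for these two processes (textbook facts, not the paper's derivation; here $a$ and $D$ denote the trap strength and diffusion constant of a generic Langevin process with drift $-a/x$) illustrate why they define distinct universality classes: the survival probability decays as a power law whose exponent is fixed for absorbed free BM but varies continuously with the trap strength for BM in a logarithmic trap,

$$
P_{\mathrm{surv}}(t) \sim t^{-\theta}, \qquad
\theta_{\mathrm{free}} = \frac{1}{2}, \qquad
\theta_{\mathrm{log\ trap}} = \frac{1}{2}\left(1 + \frac{a}{D}\right), \qquad
P(T) \sim T^{-(1+\theta)},
$$

so the avalanche-duration exponent is universal in one case and tunable in the other.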