🤖 AI Summary
This work investigates the learning dynamics of nonlinear phase retrieval under anisotropic Gaussian inputs with power-law covariance spectra. Because the conventional two-dimensional reduction, which suffices in the isotropic case, breaks down under anisotropy, we propose an analytically tractable reduction that maps the infinite-dimensional coupled dynamical system onto a closed finite-dimensional one. We rigorously characterize the scaling law for anisotropic inputs, revealing a three-stage evolution: rapid escape, slow convergence, and spectral-tail learning. Leveraging tools from random matrix theory and mean-field analysis, we derive explicit scaling laws for the mean-squared error that incorporate the decay exponent of the data spectrum. Numerical experiments validate both the predicted phase transitions and convergence rates. Our core contribution is the first rigorous, computationally tractable, and empirically verifiable theoretical framework for nonlinear learning dynamics under power-law anisotropic data.
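As a concrete point of reference, one standard formalization of this setup is a teacher-student model; the quadratic link, diagonal covariance, and notation below are our illustrative assumptions, not necessarily the paper's exact parameterization:

$$
x \sim \mathcal{N}(0,\Sigma), \qquad \Sigma = \operatorname{diag}(\lambda_1,\dots,\lambda_d), \qquad \lambda_k \propto k^{-\alpha},
$$

$$
y = \langle w^\star, x\rangle^2, \qquad \hat y = \langle w, x\rangle^2, \qquad \mathrm{MSE}(w) = \mathbb{E}_x\big[(\hat y - y)^2\big],
$$

where $\alpha > 0$ is the spectral decay exponent. In analyses of this kind, the hierarchy of summary statistics typically consists of overlaps such as $w^\top \Sigma^k w^\star$ and $w^\top \Sigma^k w$, which close into a finite system only under isotropy.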
📝 Abstract
Scaling laws describe how learning performance improves with data, compute, or training time, and have become a central theme in modern deep learning. We study this phenomenon in a canonical nonlinear model: phase retrieval with anisotropic Gaussian inputs whose covariance spectrum follows a power law. Unlike the isotropic case, where dynamics collapse to a two-dimensional system, anisotropy yields a qualitatively new regime in which an infinite hierarchy of coupled equations governs the evolution of the summary statistics. We develop a tractable reduction that reveals a three-phase trajectory: (i) fast escape from low alignment, (ii) slow convergence of the summary statistics, and (iii) spectral-tail learning in low-variance directions. From this decomposition, we derive explicit scaling laws for the mean-squared error, showing how spectral decay dictates convergence times and error curves. Experiments confirm the predicted phases and exponents. These results provide the first rigorous characterization of scaling laws in nonlinear regression with anisotropic data, highlighting how anisotropy reshapes learning dynamics.
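To see the three phases empirically, the following minimal sketch runs online SGD on a model of this kind. Everything here (the quadratic link, diagonal power-law covariance, trace normalization, step size, and dimensions) is an illustrative assumption rather than the paper's exact experimental protocol:

```python
import numpy as np

# Minimal online-SGD sketch for phase retrieval with anisotropic Gaussian data.
# All modeling choices (quadratic link, diagonal power-law covariance, trace
# normalization, step size) are illustrative assumptions, not the paper's setup.
rng = np.random.default_rng(0)
d, alpha = 100, 1.5                           # dimension and spectral decay exponent
lam = np.arange(1, d + 1, dtype=float) ** (-alpha)
lam *= d / lam.sum()                          # normalize so trace(Sigma) = d
sd = np.sqrt(lam)                             # x ~ N(0, diag(lam))

w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)              # unit-norm teacher
w = 1e-2 * rng.standard_normal(d)             # near-zero init: low initial alignment

def population_mse(w):
    """Closed-form E[((x.w)^2 - (x.w*)^2)^2] for Gaussian x, via Isserlis' theorem."""
    a = w @ (lam * w)
    b = w_star @ (lam * w_star)
    c = w @ (lam * w_star)
    return 3 * a * a + 3 * b * b - 2 * a * b - 4 * c * c

eta, T = 1e-3, 150_000                        # illustrative step size and horizon
log = []
for t in range(T):
    x = sd * rng.standard_normal(d)
    u, v = x @ w, x @ w_star
    w -= eta * 2.0 * (u * u - v * v) * u * x  # SGD step on 0.5*((x.w)^2 - y)^2
    if t % 1000 == 0:
        log.append((t, population_mse(w)))

for t, m in log[::15]:
    print(f"t={t:7d}  mse={m:.3e}")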