Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates scaling laws and weight-spectrum characteristics of shallow neural networks under feature-learning dynamics, aiming to characterize how sample complexity, weight decay, and other key hyperparameters govern the scaling exponent of the excess risk. Methodologically, it integrates matrix compressed sensing and LASSO techniques into neural scaling analysis, using quadratic and diagonal network models together with spectral theory to derive a phase diagram of the excess risk from first principles. The work fully characterizes phase transitions and plateau phenomena across distinct scaling regimes, reproduces empirically observed crossover scaling behavior, and rigorously establishes a quantitative relationship between the power-law exponent of the weight spectrum's tail and the generalization error. These results provide a novel theoretical framework for understanding generalization in deep learning.

📝 Abstract
Neural scaling laws underlie many of the recent advances in deep learning, yet their theoretical understanding remains largely confined to linear models. In this work, we present a systematic analysis of scaling laws for quadratic and diagonal neural networks in the feature learning regime. Leveraging connections with matrix compressed sensing and LASSO, we derive a detailed phase diagram for the scaling exponents of the excess risk as a function of sample complexity and weight decay. This analysis uncovers crossovers between distinct scaling regimes and plateau behaviors, mirroring phenomena widely reported in the empirical neural scaling literature. Furthermore, we establish a precise link between these regimes and the spectral properties of the trained network weights, which we characterize in detail. As a consequence, we provide a theoretical validation of recent empirical observations connecting the emergence of power-law tails in the weight spectrum with network generalization performance, yielding an interpretation from first principles.
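The LASSO connection invoked in the abstract rests on a standard identity for diagonal linear networks, sketched here as background in our own notation (it is well known in the implicit-bias literature and is not quoted from the paper): parametrizing the regression vector as an elementwise product $w = u \odot v$ turns $\ell_2$ weight decay on the factors into an $\ell_1$ penalty on $w$.

```latex
\min_{u,\,v:\; u \odot v = w} \frac{\lambda}{2}\left(\|u\|_2^2 + \|v\|_2^2\right) = \lambda \|w\|_1,
\qquad\text{hence}\qquad
\min_{u,v}\; \frac{1}{2n}\,\|y - X(u \odot v)\|_2^2 + \frac{\lambda}{2}\left(\|u\|_2^2 + \|v\|_2^2\right)
= \min_{w}\; \frac{1}{2n}\,\|y - Xw\|_2^2 + \lambda\|w\|_1 .
```

The inner minimum follows from AM–GM: $u_i^2 + v_i^2 \ge 2\,|u_i v_i| = 2\,|w_i|$ coordinatewise, with equality at $|u_i| = |v_i| = \sqrt{|w_i|}$, which is why weight decay on the diagonal factorization reproduces the LASSO objective on the effective weights.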
Problem

Research questions and friction points this paper is trying to address.

Analyze scaling laws for shallow neural networks in feature learning
Establish link between scaling regimes and weight spectral properties
Provide theoretical validation for power-law tails in weight spectrum
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzed scaling laws for quadratic neural networks
Linked scaling regimes to weight spectral properties
Provided theoretical validation for power-law weight spectra
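The claimed link between power-law tails in the weight spectrum and generalization presupposes a way to measure that tail exponent. A minimal sketch using the Hill estimator on synthetic power-law data follows; the estimator choice and all names here are ours for illustration, not taken from the paper:

```python
import numpy as np

def hill_estimator(values, k):
    """Estimate the tail exponent alpha of a power-law-tailed sample
    from its k largest order statistics (Hill estimator)."""
    x = np.sort(values)[::-1]                 # descending order
    logs = np.log(x[:k]) - np.log(x[k])       # log-excesses over the k-th value
    return 1.0 / logs.mean()

# Synthetic check: draw from an exact Pareto law P(X > t) = t^{-alpha}, t >= 1,
# via inverse-CDF sampling, then recover alpha from the tail.
rng = np.random.default_rng(0)
alpha_true = 2.5
samples = rng.random(20_000) ** (-1.0 / alpha_true)

alpha_hat = hill_estimator(samples, k=2_000)
print(f"estimated tail exponent: {alpha_hat:.2f}")  # close to alpha_true = 2.5
```

In practice one would apply the same estimator to the sorted singular values of the trained weight matrix; for an exact Pareto sample the Hill estimator is the tail MLE, so with 2,000 tail points the estimate concentrates tightly around the true exponent.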