AI Summary
The paper challenges the conventional paradigm of measuring model complexity solely by parameter count, focusing instead on the fundamental nature of the double-descent phenomenon. Method: leveraging random matrix theory and the deterministic-equivalent method, the authors rigorously analyze linear and random-feature models, adopting the weight norm, rather than the parameter count, as the core complexity measure. Contribution/Results: they prove that double descent persists under this norm-based definition of capacity, derive precise concentration results for the weight norm, and establish a quantitative relationship between the weight norm and the generalization error. This yields the first weight-norm-driven scaling-law framework, providing a more intrinsic theoretical benchmark for overparameterized learning and revealing a more universal mechanistic link among model complexity, training dynamics, and generalization performance.
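As a concrete illustration of the setting (a minimal sketch, not code from the paper), the snippet below fits a random-feature model by minimum-norm (ridgeless) least squares and tracks both the test error and the norm of the learned weights as the number of random features `p` varies. All choices here (linear teacher, ReLU features, dimensions, noise level) are illustrative assumptions; in typical runs, both the test error and the weight norm spike near the interpolation threshold `p ≈ n` and then decrease, which is the double-descent picture the paper recasts in terms of the weight norm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions (not from the paper): a linear teacher in d dimensions,
# a random-features student with a fixed random first layer and ReLU features,
# fit by minimum-norm (ridgeless) least squares.
n, d, sigma = 200, 30, 0.1                        # training samples, input dim, label noise
w_star = rng.standard_normal(d) / np.sqrt(d)      # teacher weights

X_train = rng.standard_normal((n, d))
y_train = X_train @ w_star + sigma * rng.standard_normal(n)
X_test = rng.standard_normal((5000, d))
y_test = X_test @ w_star

for p in (20, 50, 100, 150, 190, 210, 300, 600, 2000):   # number of random features
    W = rng.standard_normal((d, p)) / np.sqrt(d)          # fixed random first layer
    F_train = np.maximum(X_train @ W, 0.0)                # ReLU random features
    F_test = np.maximum(X_test @ W, 0.0)
    a = np.linalg.pinv(F_train) @ y_train                 # minimum-norm least-squares fit
    test_mse = np.mean((F_test @ a - y_test) ** 2)
    print(f"p={p:5d}  ||a||_2={np.linalg.norm(a):8.3f}  test MSE={test_mse:.4f}")
```

Plotting the test error against `||a||_2` instead of `p` is the norm-based view of capacity the summary refers to.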
Abstract
We investigate double descent and scaling laws in terms of weights rather than the number of parameters. Specifically, we analyze linear and random-feature models using the deterministic-equivalence approach from random matrix theory. We precisely characterize how the weight norm concentrates around deterministic quantities and elucidate the relationship between the expected test error and the norm-based capacity (complexity). Our results rigorously answer whether double descent exists under norm-based capacity and reshape the corresponding scaling laws. Moreover, they prompt a rethinking of the data-parameter paradigm, from under-parameterized to over-parameterized regimes, by shifting the focus to weight norms rather than parameter counts.
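For concreteness, the objects involved can be written schematically as follows; the notation is assumed here under a standard ridgeless least-squares setup and is not quoted from the paper. With feature matrix F and labels y, the minimum-norm interpolator and the norm-based capacity are

```latex
% Schematic only; notation assumed, not taken from the paper.
% Feature matrix F \in \mathbb{R}^{n \times p}, labels y \in \mathbb{R}^{n}.
\hat{a} \;=\; \lim_{\lambda \to 0^{+}} \bigl(F^{\top}F + \lambda I\bigr)^{-1} F^{\top} y
\;=\; F^{+} y,
\qquad
\text{norm-based capacity: } \|\hat{a}\|_{2}^{2}.
```

Deterministic-equivalence results from random matrix theory give deterministic limits around which quantities such as \(\|\hat{a}\|_{2}^{2}\) concentrate as n and p grow proportionally, which is the type of statement the abstract refers to when relating the expected test error to the norm-based capacity rather than to p alone.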