Re-examining Double Descent and Scaling Laws under Norm-based Capacity via Deterministic Equivalence

๐Ÿ“… 2025-02-03
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
The paper challenges the conventional paradigm of measuring model complexity solely by parameter count, focusing instead on the fundamental nature of the double-descent phenomenon. Method: Leveraging random matrix theory and the deterministic equivalent method, the authors rigorously analyze linear and random-feature models, adopting weight normโ€”not parameter countโ€”as the core complexity measure. Contribution/Results: They prove that double descent persists under this norm-based capacity definition; derive precise concentration limits for the weight norm; and establish a quantitative relationship between weight norm and generalization error. This yields the first weight-norm-driven scaling law framework, providing a more intrinsic theoretical benchmark for overparameterized learning. The framework reveals a more universal mechanistic link among model complexity, training dynamics, and generalization performance.

๐Ÿ“ Abstract
We investigate double descent and scaling laws in terms of weights rather than the number of parameters. Specifically, we analyze linear and random feature models using the deterministic equivalence approach from random matrix theory. We precisely characterize how the weight norm concentrates around deterministic quantities and elucidate the relationship between the expected test error and the norm-based capacity (complexity). Our results rigorously answer whether double descent exists under norm-based capacity and reshape the corresponding scaling laws. Moreover, they prompt a rethinking of the data-parameter paradigm - from under-parameterized to over-parameterized regimes - by shifting the focus to norms (weights) rather than parameter count.
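To make the setting concrete, the following is a minimal numerical sketch (not the paper's analysis) of the random feature model the abstract refers to: a min-norm least-squares fit on ReLU random features, sweeping the number of features `p` past the sample count `n`. The teacher vector, noise level, and feature map are illustrative assumptions; tracking both the test error and the fitted weight norm `||a||` is what lets one compare parameter-count capacity against norm-based capacity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 30  # training samples, input dimension

# Illustrative linear teacher with Gaussian inputs and label noise (assumed setup).
w_star = rng.standard_normal(d) / np.sqrt(d)
X_train = rng.standard_normal((n, d))
y_train = X_train @ w_star + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((2000, d))
y_test = X_test @ w_star

def random_features(X, W):
    # ReLU random-feature map: phi(x) = max(0, W x).
    return np.maximum(X @ W.T, 0.0)

for p in [10, 50, 100, 200, 1000]:  # number of random features
    W = rng.standard_normal((p, d)) / np.sqrt(d)
    Phi = random_features(X_train, W)
    # Minimum-norm least-squares solution; pinv covers both the
    # under-parameterized (p < n) and over-parameterized (p > n) regimes.
    a = np.linalg.pinv(Phi) @ y_train
    test_mse = np.mean((random_features(X_test, W) @ a - y_test) ** 2)
    print(f"p={p:5d}  ||a||={np.linalg.norm(a):8.3f}  test MSE={test_mse:.4f}")
```

In runs of this kind, the test error typically spikes near the interpolation threshold `p ≈ n` and falls again as `p` grows, while the weight norm of the min-norm interpolator traces a related peak, which is the behavior the paper studies through deterministic equivalents.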
Problem

Research questions and friction points this paper is trying to address.

Double Descent Phenomenon
Model Complexity
Generalization Ability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Matrix Theory
Double Descent Phenomenon
Norm-based Capacity
๐Ÿ”Ž Similar Papers
No similar papers found.
Yichen Wang
Department of Computer Science, University of Warwick, UK
Yudong Chen
Department of Computer Sciences, University of Wisconsin-Madison, USA
Fanghui Liu
Assistant Professor, University of Warwick
Foundations of Modern ML