Re-examining Double Descent and Scaling Laws under Norm-based Capacity via Deterministic Equivalence

๐Ÿ“… 2025-02-03
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
The paper challenges the conventional paradigm of measuring model complexity solely by parameter count, focusing instead on the fundamental nature of the double-descent phenomenon. Method: Leveraging random matrix theory and the deterministic equivalent method, the authors rigorously analyze linear and random-feature models, adopting weight normโ€”not parameter countโ€”as the core complexity measure. Contribution/Results: They prove that double descent persists under this norm-based capacity definition; derive precise concentration limits for the weight norm; and establish a quantitative relationship between weight norm and generalization error. This yields the first weight-norm-driven scaling law framework, providing a more intrinsic theoretical benchmark for overparameterized learning. The framework reveals a more universal mechanistic link among model complexity, training dynamics, and generalization performance.

๐Ÿ“ Abstract
We investigate double descent and scaling laws in terms of weights rather than the number of parameters. Specifically, we analyze linear and random feature models using the deterministic equivalence approach from random matrix theory. We precisely characterize how the weight norm concentrates around deterministic quantities and elucidate the relationship between the expected test error and the norm-based capacity (complexity). Our results rigorously answer whether double descent exists under norm-based capacity and reshape the corresponding scaling laws. Moreover, they prompt a rethinking of the data-parameter paradigm - from under-parameterized to over-parameterized regimes - by shifting the focus to norms (weights) rather than parameter count.
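To make the setting concrete, the following is a minimal numerical sketch (not the paper's analysis) of the random feature model the abstract refers to: a min-norm least-squares fit on ReLU random features, sweeping the number of features `p` past the sample count `n`. The teacher vector, noise level, and feature map are illustrative assumptions; tracking both the test error and the fitted weight norm `||a||` is what lets one compare parameter-count capacity against norm-based capacity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 30  # training samples, input dimension

# Illustrative linear teacher with Gaussian inputs and label noise (assumed setup).
w_star = rng.standard_normal(d) / np.sqrt(d)
X_train = rng.standard_normal((n, d))
y_train = X_train @ w_star + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((2000, d))
y_test = X_test @ w_star

def random_features(X, W):
    # ReLU random-feature map: phi(x) = max(0, W x).
    return np.maximum(X @ W.T, 0.0)

for p in [10, 50, 100, 200, 1000]:  # number of random features
    W = rng.standard_normal((p, d)) / np.sqrt(d)
    Phi = random_features(X_train, W)
    # Minimum-norm least-squares solution; pinv covers both the
    # under-parameterized (p < n) and over-parameterized (p > n) regimes.
    a = np.linalg.pinv(Phi) @ y_train
    test_mse = np.mean((random_features(X_test, W) @ a - y_test) ** 2)
    print(f"p={p:5d}  ||a||={np.linalg.norm(a):8.3f}  test MSE={test_mse:.4f}")
```

In runs of this kind, the test error typically spikes near the interpolation threshold `p ≈ n` and falls again as `p` grows, while the weight norm of the min-norm interpolator traces a related peak, which is the behavior the paper studies through deterministic equivalents.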
Problem

Research questions and friction points this paper is trying to address.

Double Descent Phenomenon
Model Complexity
Generalization Ability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Matrix Theory
Double Descent Phenomenon
Norm-based Capacity
๐Ÿ”Ž Similar Papers
No similar papers found.
Yichen Wang
Department of Computer Science, University of Warwick, UK
Yudong Chen
Department of Computer Sciences, University of Wisconsin-Madison, USA
Fanghui Liu
Assistant Professor, University of Warwick
Foundations of Modern ML