🤖 AI Summary
This work establishes a deterministic equivalent for the two-point function of a random matrix resolvent and uses it to characterize the asymptotic generalization performance of high-dimensional linear models trained with stochastic gradient descent (SGD), including high-dimensional linear regression, kernel regression, and random feature models. The two-point resolvent equivalent overcomes a key limitation of conventional single-point deterministic equivalents, combining random matrix theory, Stieltjes transforms, and a dynamical model of SGD. The framework yields explicit asymptotic expressions for the generalization error, recovering known results while extending to new settings such as non-isotropic data and general step-size schedules. Theoretical predictions agree closely with numerical experiments across diverse model configurations, providing a unified, principled tool for exact asymptotic analysis of high-dimensional learning problems driven by SGD.
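The building block behind such results is the concentration of resolvent traces around a deterministic limit given by a self-consistent Stieltjes-transform equation. As a hedged illustration (the classical single-point case, not the paper's two-point extension), the sketch below checks numerically that for isotropic Gaussian data the normalized resolvent trace (1/d) tr (W - zI)^{-1} matches the Marchenko-Pastur prediction; the values of n, d, and z are illustrative choices, not taken from the paper.

```python
import numpy as np

# Minimal sketch of a (single-point) deterministic equivalent, assuming
# isotropic Gaussian data. The paper's two-point result concerns products
# of two resolvents, but the concentration phenomenon is the same in spirit.
rng = np.random.default_rng(0)
n, d = 4000, 1000            # samples, dimension; aspect ratio gamma = d/n
gamma = d / n
z = -0.5                     # spectral argument; z < 0 keeps W - z*I invertible

X = rng.standard_normal((n, d)) / np.sqrt(n)
W = X.T @ X                  # sample covariance; spectrum ~ Marchenko-Pastur

# Empirical Stieltjes transform: (1/d) tr (W - z I)^{-1}
m_emp = np.trace(np.linalg.inv(W - z * np.eye(d))) / d

# Deterministic equivalent: Marchenko-Pastur self-consistent equation
#   m(z) = 1 / (1 - gamma - z - gamma * z * m(z)),
# solved here by fixed-point iteration.
m = 1.0
for _ in range(200):
    m = 1.0 / (1.0 - gamma - z - gamma * z * m)

print(f"empirical  m(z) = {m_emp:.5f}")
print(f"predicted  m(z) = {m:.5f}")
```

On a single draw the two values typically agree to a few decimal places, and the fluctuations shrink as d grows; this is exactly the kind of high-dimensional concentration that makes deterministic-equivalent analyses of SGD tractable.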
📝 Abstract
We derive a novel deterministic equivalence for the two-point function of a random matrix resolvent. Using this result, we give a unified derivation of the performance of a wide variety of high-dimensional linear models trained with stochastic gradient descent. This includes high-dimensional linear regression, kernel regression, and random feature models. Our results include previously known asymptotics as well as novel ones.
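To make the SGD setting concrete, here is a minimal one-pass SGD simulation for high-dimensional linear regression with isotropic Gaussian covariates. It produces the empirical population-risk trajectory that results of this kind predict in closed form; the teacher vector w_star, noise level sigma, and step size eta are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

# One-pass SGD on squared loss for a teacher-student linear regression,
# assuming isotropic Gaussian covariates and Gaussian label noise.
rng = np.random.default_rng(1)
d = 500                      # dimension
sigma = 0.1                  # label-noise std (hypothetical choice)
eta = 0.5 / d                # constant step size; O(1/d) scaling keeps SGD stable
steps = 20 * d

w_star = rng.standard_normal(d) / np.sqrt(d)   # teacher with ||w_star|| ~ 1
w = np.zeros(d)

risks = []
for t in range(steps):
    x = rng.standard_normal(d)
    y = x @ w_star + sigma * rng.standard_normal()
    w -= eta * (x @ w - y) * x               # SGD step on one fresh sample
    if t % d == 0:
        # Population risk for isotropic data: ||w - w_star||^2 + sigma^2
        risks.append(np.sum((w - w_star) ** 2) + sigma ** 2)

print(np.round(risks, 4))                    # risk decays toward the noise floor
```

Averaging such curves over runs gives the empirical generalization error whose large-d limit the deterministic-equivalent framework characterizes exactly.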