Learning a Single Index Model from Anisotropic Data with vanilla Stochastic Gradient Descent

📅 2025-03-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work studies learning single-index models (SIMs) under anisotropic Gaussian inputs, focusing on the implicit adaptation mechanism of standard stochastic gradient descent (SGD). We establish theoretically that vanilla SGD, without explicit preprocessing or covariance correction, automatically adapts to the input covariance structure and achieves global convergence. To characterize the impact of anisotropy on learnability, we introduce the notion of *covariance-dominated effective dimension*. Leveraging this concept, we derive matching upper and lower bounds on the sample complexity, unifying the statistical and optimization perspectives. This is the first work to prove both global convergence and implicit covariance adaptation of vanilla SGD for SIMs under anisotropic inputs. Our results offer a new theoretical lens on implicit regularization in deep learning, linking geometric properties of the data to algorithmic behavior.

📝 Abstract
We investigate the problem of learning a Single Index Model (SIM), a popular model for studying the ability of neural networks to learn features, from anisotropic Gaussian inputs by training a neuron using vanilla Stochastic Gradient Descent (SGD). While the isotropic case has been extensively studied, the anisotropic case has received less attention and the impact of the covariance matrix on the learning dynamics remains unclear. For instance, Mousavi-Hosseini et al. (2023b) proposed a spherical SGD that requires a separate estimation of the data covariance matrix, thereby oversimplifying the influence of covariance. In this study, we analyze the learning dynamics of vanilla SGD under the SIM with anisotropic input data, demonstrating that vanilla SGD automatically adapts to the data's covariance structure. Leveraging these results, we derive upper and lower bounds on the sample complexity using a notion of effective dimension that is determined by the structure of the covariance matrix instead of the input data dimension.
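To make the setup concrete, here is a minimal sketch of the training procedure the abstract describes: a single neuron trained on a single-index target with plain online SGD over anisotropic Gaussian inputs, with no preprocessing or covariance correction. The link function (tanh), dimension, step size, and covariance spectrum below are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
eigs = 1.0 / np.arange(1, d + 1)        # decaying spectrum: anisotropic data
scale = np.sqrt(eigs)                   # x = scale * g has covariance diag(eigs)

w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)        # ground-truth index direction

w = 0.1 * rng.standard_normal(d)        # student neuron, small random init
lr = 0.05
for _ in range(20000):
    x = scale * rng.standard_normal(d)  # fresh anisotropic Gaussian sample
    y = np.tanh(x @ w_star)             # single-index target with tanh link
    pred = np.tanh(x @ w)
    # Per-sample gradient of the squared loss; tanh'(z) = 1 - tanh(z)^2.
    grad = (pred - y) * (1.0 - pred ** 2) * x
    w -= lr * grad                      # vanilla SGD, no covariance correction

alignment = abs(w @ w_star) / np.linalg.norm(w)
print(f"alignment |<w, w*>| / ||w|| = {alignment:.3f}")
```

In this toy run the student direction aligns with the true index vector even though the input coordinates have very different variances, illustrating the kind of implicit covariance adaptation the paper analyzes.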
Problem

Research questions and friction points this paper is trying to address.

Study learning of single-index models from anisotropic data
Analyze vanilla SGD's adaptation to the covariance structure
Derive sample complexity bounds via an effective dimension
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses vanilla SGD on anisotropic data
Adapts automatically to the covariance structure
Bounds sample complexity via an effective dimension
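The paper's precise "covariance-dominated effective dimension" is not reproduced on this page; a standard covariance-based proxy, tr(Σ)/λ_max(Σ), is sketched below purely as an assumption to illustrate how spectral decay makes the effective dimension much smaller than the ambient one.

```python
import numpy as np

def effective_dimension(sigma: np.ndarray) -> float:
    """Ratio of total variance to the top eigenvalue of the covariance.

    This is a common proxy for effective dimension, not necessarily the
    paper's definition.
    """
    eigvals = np.linalg.eigvalsh(sigma)
    return float(eigvals.sum() / eigvals.max())

d = 100
iso = np.eye(d)                                   # isotropic covariance
decay = np.diag(1.0 / np.arange(1, d + 1) ** 2)   # fast spectral decay

print(effective_dimension(iso))    # equals the ambient dimension d
print(effective_dimension(decay))  # far below d under fast decay
```

Under this proxy, isotropic data has effective dimension d, while a rapidly decaying spectrum collapses it to a small constant, which is the regime where covariance-aware sample complexity bounds can improve on ambient-dimension bounds.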