Gaussian Universality in Neural Network Dynamics with Generalized Structured Input Distributions

📅 2024-05-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
A foundational assumption in theoretical analyses of deep neural networks—namely, that input distributions can be treated as Gaussian—remains unverified, particularly for structured non-Gaussian inputs such as Gaussian mixtures. Method: The authors develop a continuous-time stochastic differential equation model of SGD dynamics, complemented by rigorous theoretical analysis and large-scale empirical validation across diverse structured input distributions. Contribution/Results: They demonstrate, for the first time, that after appropriate input standardization, parameter evolution under non-Gaussian structured inputs closely matches that under Gaussian inputs—revealing strong universality. Based on this insight, they propose a novel standardization scheme and construct the first unified theoretical framework bridging the gap between idealized Gaussian assumptions and real-world data distributions. This significantly enhances both the explanatory power and predictive accuracy of existing theoretical models in practical settings.
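The standardization step the summary describes can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's exact scheme: the mixture parameters, the `sample_gaussian_mixture` helper, and the choice of per-feature standardization are all illustrative. The point is only that after standardization, a structured Gaussian-mixture input matches a standard Gaussian in its first two moments, which is the regime where the claimed universality is observed.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gaussian_mixture(n, d, means, weights, rng):
    """Draw n d-dimensional samples from a Gaussian mixture
    with unit component covariance (illustrative model)."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return means[comps] + rng.standard_normal((n, d))

def standardize(X):
    """Per-feature standardization: zero mean, unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

d = 50
# Two well-separated components: a clearly non-Gaussian, structured input.
means = np.stack([np.full(d, -2.0), np.full(d, 2.0)])
X = sample_gaussian_mixture(10_000, d, means, [0.5, 0.5], rng)
Z = standardize(X)

# After standardization the first two moments match the Gaussian reference.
print(np.abs(Z.mean(axis=0)).max())      # ≈ 0
print(np.abs(Z.std(axis=0) - 1).max())   # ≈ 0
```

Higher moments of `Z` still differ from a Gaussian's; the paper's claim is that SGD parameter dynamics are nonetheless insensitive to that difference once the inputs are standardized.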

📝 Abstract
Bridging the gap between the practical performance of deep learning and its theoretical foundations often involves analyzing neural networks through stochastic gradient descent (SGD). Expanding on previous research that focused on modeling structured inputs under a simple Gaussian setting, we analyze the behavior of a deep learning system trained on inputs modeled as Gaussian mixtures to better simulate more general structured inputs. Through empirical analysis and theoretical investigation, we demonstrate that under certain standardization schemes, the deep learning model converges toward Gaussian setting behavior, even when the input data follow more complex or real-world distributions. This finding exhibits a form of universality in which diverse structured distributions yield results consistent with Gaussian assumptions, which can support the theoretical understanding of deep learning models.
Problem

Research questions and friction points this paper is trying to address.

Analyzing neural networks with Gaussian mixture input distributions
Demonstrating convergence to Gaussian behavior in complex inputs
Exploring universality in deep learning model performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzing neural networks via stochastic gradient descent
Modeling inputs as Gaussian mixtures for generality
Demonstrating universality with Gaussian convergence behavior
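The first bullet refers to modeling SGD as a continuous-time stochastic process. A common way to simulate such an SDE, dθ = −∇L(θ) dt + σ dW, is Euler–Maruyama discretization; the sketch below does this for a toy quadratic loss. The loss, the noise scale, and the function names are assumptions for illustration, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

def euler_maruyama(grad, theta0, eta, steps, noise_scale, rng):
    """Simulate dθ = -∇L(θ) dt + σ dW, a continuous-time surrogate for SGD.

    eta plays the role of the time step (learning rate); noise_scale stands in
    for the gradient noise induced by minibatch sampling.
    """
    theta = theta0.astype(float).copy()
    traj = [theta.copy()]
    for _ in range(steps):
        dW = rng.standard_normal(theta.shape) * np.sqrt(eta)
        theta += -eta * grad(theta) + noise_scale * dW
        traj.append(theta.copy())
    return np.array(traj)

# Toy quadratic loss L(θ) = ||θ||²/2, so ∇L(θ) = θ.
traj = euler_maruyama(lambda t: t, theta0=np.ones(5), eta=0.01,
                      steps=2000, noise_scale=0.1, rng=rng)
print(traj[-1])  # fluctuates near the minimum at θ = 0
```

Under the universality claim, trajectories like `traj` generated from standardized mixture inputs would be statistically indistinguishable from those generated under Gaussian inputs.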
Jaeyong Bae
Department of Physics, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
Hawoong Jeong
Professor of Physics, KAIST
Complex Systems · Statistical Physics · Network Science · Data Science · Artificial Intelligence