Learning Beyond the Gaussian Data: Learning Dynamics of Neural Networks on an Expressive and Cumulant-Controllable Data Model

📅 2026-02-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes the first controllable non-Gaussian data generation model based on a Hermite polynomial expansion, enabling precise manipulation of higher-order cumulants such as skewness and kurtosis. Challenging the conventional assumption in neural network theory that data follow a Gaussian distribution, which neglects higher-order statistical structure, the study conducts online learning experiments on two-layer networks. The findings reveal a hierarchical, moment-wise learning progression: networks first acquire the mean and covariance structure and only then progressively capture higher-order statistical features. This mechanism is corroborated by pretraining-finetuning experiments on Fashion-MNIST, demonstrating the model's capacity to bridge the gap between simplified theoretical assumptions and the complexity of real-world data distributions. The approach offers a principled framework for investigating how distributional characteristics shape learning dynamics in neural networks.

📝 Abstract
We study the effect of high-order statistics of data on the learning dynamics of neural networks (NNs) by using a moment-controllable non-Gaussian data model. Considering the expressivity of two-layer neural networks, we first construct the data model as a generative two-layer NN where the activation function is expanded by using Hermite polynomials. This allows us to achieve interpretable control over high-order cumulants such as skewness and kurtosis through the Hermite coefficients while keeping the data model realistic. Using samples generated from the data model, we perform controlled online learning experiments with a two-layer NN. Our results reveal a moment-wise progression in training: networks first capture low-order statistics such as mean and covariance, and progressively learn high-order cumulants. Finally, we pretrain the generative model on the Fashion-MNIST dataset and leverage the generated samples for further experiments. The results of these additional experiments confirm our conclusions and show the utility of the data model in a real-world scenario. Overall, our proposed approach bridges simplified data assumptions and practical data complexity, which offers a principled framework for investigating distributional effects in machine learning and signal processing.
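The core mechanism described in the abstract, an activation expanded in probabilists' Hermite polynomials whose coefficients dial in higher-order cumulants, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the truncation at He_3, the coefficient values, and the unit-norm weight normalization are all assumptions made here for clarity.

```python
import numpy as np

def hermite_activation(z, coeffs):
    """Activation expanded in probabilists' Hermite polynomials He_0..He_3.

    For a standard-Gaussian input, c_2 injects skewness and c_3 injects
    excess kurtosis, which is the cumulant-control idea from the abstract.
    """
    he = [np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z]
    return sum(c * h for c, h in zip(coeffs, he))

def sample_skewness(x):
    """Standardized third sample moment."""
    x = x - x.mean()
    return (x**3).mean() / (x**2).mean() ** 1.5

rng = np.random.default_rng(0)
n, d_latent, d_out = 100_000, 8, 16

# Generative two-layer network: Gaussian latent -> linear map -> Hermite activation.
# Columns are normalized so each pre-activation is exactly N(0, 1).
W = rng.standard_normal((d_latent, d_out))
W /= np.linalg.norm(W, axis=0)
z = rng.standard_normal((n, d_latent))

# Illustrative (assumed) coefficients [c_0, c_1, c_2, c_3]:
# nonzero c_2 and c_3 make the outputs non-Gaussian;
# setting them to zero reduces the model to a linear, Gaussian one.
x_nongauss = hermite_activation(z @ W, [0.0, 1.0, 0.3, 0.1])
x_gauss = hermite_activation(z @ W, [0.0, 1.0, 0.0, 0.0])

skew_nongauss = sample_skewness(x_nongauss[:, 0])
skew_gauss = sample_skewness(x_gauss[:, 0])
```

Because the Hermite polynomials are orthogonal under the Gaussian measure, the output cumulants can be written in closed form in the coefficients (e.g. the variance here is `sum(c_k**2 * k!)`), which is what makes the control interpretable.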
Problem

Research questions and friction points this paper is trying to address.

high-order statistics
learning dynamics
non-Gaussian data
cumulants
neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

cumulant-controllable data model
Hermite polynomial expansion
learning dynamics
high-order statistics
non-Gaussian data