🤖 AI Summary
This study investigates how per-class structural differences and sampling imbalance in real-world data lead to heterogeneous learning dynamics across classes in diffusion models. Using a high-dimensional analytical framework built on a random-features model trained on Gaussian mixtures, together with an analysis of the feature-covariance spectrum, the work characterizes per-class generalization and memorization times in score-based diffusion models. The analysis reveals a hierarchy governing learning order: class variance is the primary determinant, favoring higher-variance classes, while centroid geometry plays a secondary role. Sampling imbalance can reverse this ordering and, when strong, forces minority classes to acquire distinct, delayed speciation times during backward diffusion. These predictions are validated with U-Net experiments on Fashion-MNIST and suggest that diffusion models can memorize some classes while others remain insufficiently learned.
📝 Abstract
Real-world datasets are inherently heterogeneous, yet how per-class structural differences and sampling imbalance shape the training dynamics of diffusion models, and potentially exacerbate disparities, remains poorly understood. While models typically transition from an initial phase of generalization to memorizing the training set, existing theory assumes homogeneous data, leaving open how class imbalance and heterogeneity reshape these dynamics. In this work, we develop a high-dimensional analytical framework to study class-dependent learning in score-based diffusion models. Analyzing a random-features model trained on Gaussian mixtures, we derive the feature-covariance spectrum to characterize per-class generalization and memorization times. We reveal the explicit hierarchy governing these dynamics: class variance is the primary determinant of learning order, consistently favoring higher-variance classes, while centroid geometry plays a secondary role. Sampling imbalance acts as a modulator that can reverse this ordering and, under strong imbalance, forces minority classes to acquire distinct, delayed speciation times during backward diffusion. Together, these results suggest that diffusion models can memorize some classes while others remain insufficiently learned. We validate our theoretical predictions empirically using U-Net models trained on Fashion-MNIST.
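As a rough illustration of the objects the abstract names, the following sketch samples an imbalanced two-class Gaussian mixture and computes the eigenvalue spectrum of an empirical random-feature covariance, overall and per class. It is not the paper's model or derivation: the `tanh` nonlinearity, the dimensions, the class variances and weights, and the helper names `sample_mixture` and `feature_covariance_spectrum` are all assumptions chosen for illustration, and the diffusion noise level is omitted entirely.

```python
import numpy as np


def sample_mixture(n, d, means, stds, weights, rng):
    """Draw n points from a Gaussian mixture with isotropic components."""
    labels = rng.choice(len(weights), size=n, p=weights)
    x = means[labels] + stds[labels, None] * rng.standard_normal((n, d))
    return x, labels


def feature_covariance_spectrum(x, w):
    """Eigenvalues (ascending) of the uncentered covariance of tanh(x W^T / sqrt(d))."""
    n, d = x.shape
    phi = np.tanh(x @ w.T / np.sqrt(d))  # (n, p) random features
    cov = phi.T @ phi / n                # (p, p) empirical feature covariance
    return np.linalg.eigvalsh(cov)


rng = np.random.default_rng(0)
d, p, n = 64, 128, 2000                  # hypothetical sizes, not from the paper

mu = np.ones(d)                          # centroid with norm sqrt(d)
means = np.stack([mu, -mu])              # two symmetric class centroids
stds = np.array([1.0, 0.5])              # class 0 has the higher variance
weights = np.array([0.9, 0.1])           # class 1 is the minority class

x, labels = sample_mixture(n, d, means, stds, weights, rng)
w = rng.standard_normal((p, d))          # fixed random first-layer weights

spectrum_all = feature_covariance_spectrum(x, w)
spectrum_per_class = [
    feature_covariance_spectrum(x[labels == k], w) for k in range(2)
]

for k, spec in enumerate(spectrum_per_class):
    print(f"class {k}: n={np.sum(labels == k)}, top eigenvalue={spec[-1]:.3f}")
```

Changing `stds` or `weights` and comparing the per-class spectra gives an informal feel for how variance and imbalance alter the feature covariance. The per-class generalization and memorization times discussed in the abstract come from the paper's analysis, not from this sketch.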