Score Augmentation for Diffusion Models

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Diffusion models suffer from overfitting during training, particularly in data-limited regimes. To address this, we propose ScoreAug, a method that introduces data augmentation into the noise space: transformations are applied to noisy samples, and the denoiser is required to predict the correspondingly augmented clean targets, establishing an equivariant learning objective we term "score augmentation." A theoretical analysis derives the mapping relationships among score functions under general transformations, and the design mitigates overfitting while avoiding the data-leakage issues of conventional clean-data augmentations. Extensive experiments on CIFAR-10, FFHQ, AFHQv2, and ImageNet show that ScoreAug improves generalization and training stability, converges robustly, and composes with existing diffusion training techniques, yielding consistent gains without architectural modifications.

📝 Abstract
Diffusion models have achieved remarkable success in generative modeling. However, this study confirms the existence of overfitting in diffusion model training, particularly in data-limited regimes. To address this challenge, we propose Score Augmentation (ScoreAug), a novel data augmentation framework specifically designed for diffusion models. Unlike conventional augmentation approaches that operate on clean data, ScoreAug applies transformations to noisy data, aligning with the inherent denoising mechanism of diffusion. Crucially, ScoreAug further requires the denoiser to predict the augmentation of the original target. This design establishes an equivariant learning objective, enabling the denoiser to learn scores across varied denoising spaces, thereby realizing what we term score augmentation. We also theoretically analyze the relationship between scores in different spaces under general transformations. In experiments, we extensively validate ScoreAug on multiple benchmarks including CIFAR-10, FFHQ, AFHQv2, and ImageNet, with results demonstrating significant performance improvements over baselines. Notably, ScoreAug effectively mitigates overfitting across diverse scenarios, such as varying data scales and model capacities, while exhibiting stable convergence properties. Another advantage of ScoreAug over standard data augmentation lies in its ability to circumvent data leakage issues under certain conditions. Furthermore, we show that ScoreAug can be synergistically combined with traditional data augmentation techniques to achieve additional performance gains.
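The training objective described in the abstract can be sketched in a few lines. The following is a minimal, illustrative implementation (not the authors' code): it assumes an EDM/DDPM-style denoiser `D(x_noisy, sigma)` that predicts the clean image, and uses flips and 90-degree rotations as the transform group; the function names are hypothetical.

```python
# Hypothetical sketch of a ScoreAug-style training loss (not the authors' code).
# The key point: the transform T is applied to the NOISY input, and the denoiser
# must predict the equally transformed clean target: D(T(x0 + n), sigma) ≈ T(x0).
import numpy as np


def random_transform(rng):
    """Sample a transform T from a small group (flips and 90-degree rotations),
    returned as a function acting on image batches shaped (B, C, H, W)."""
    k = int(rng.integers(0, 4))          # number of 90-degree rotations
    flip = bool(rng.random() < 0.5)

    def T(x):
        x = np.rot90(x, k, axes=(-2, -1))
        return np.flip(x, axis=-1) if flip else x

    return T


def scoreaug_loss(denoiser, x0, sigma, rng):
    """Equivariant objective: augment the noisy input and require the denoiser
    to predict the correspondingly augmented clean target."""
    noise = rng.standard_normal(x0.shape) * sigma
    T = random_transform(rng)            # augmentation happens in noise space
    pred = denoiser(T(x0 + noise), sigma)
    return np.mean((pred - T(x0)) ** 2)
```

A quick sanity check: with the identity "denoiser" `D(x, sigma) = x`, the loss reduces to the mean squared noise (roughly `sigma**2`), confirming that the transform is applied consistently to both input and target.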
Problem

Research questions and friction points this paper is trying to address.

Overfitting in diffusion model training, especially in data-limited regimes
Conventional augmentations operate on clean data and risk data leakage
Need for augmentation aligned with diffusion's denoising mechanism
Innovation

Methods, ideas, or system contributions that make the work stand out.

Applies transformations to noisy data
Equivariant learning objective for denoiser
Synergizes with traditional augmentation techniques
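As background for the equivariant-objective bullet above, the relationship between scores in the original and transformed spaces follows from a standard change of variables (the paper derives the general case; the orthogonal specialization shown here is an illustrative instance). For an invertible linear transform $T$ with $y = Tx$:

```latex
% Density under y = T x:
p_Y(y) = \frac{p_X(T^{-1} y)}{\lvert \det T \rvert}
% Taking log-gradients gives the score mapping:
\nabla_y \log p_Y(y) = T^{-\top}\, \nabla_x \log p_X(x)\big|_{x = T^{-1} y}
% For orthogonal T (flips, rotations), T^{-\top} = T, so the score is equivariant:
s_Y(y) = T\, s_X(T^{\top} y)
```

This is why transforms such as flips and rotations are convenient: the score in the augmented space is just the transformed score, so a single denoiser can be trained consistently across all augmented denoising spaces.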