Generalization of Diffusion Models Arises with a Balanced Representation Space

šŸ“… 2025-12-24
šŸ“ˆ Citations: 0
✨ Influential: 0
šŸ¤– AI Summary
Diffusion models often memorize training data when the denoising objective is over-optimized, which impairs generalization. From a representation-learning perspective, this work formally distinguishes memorization, characterized by spiky, overfitted representations, from generalization, characterized by balanced, statistically grounded representations. Through an analysis of a two-layer ReLU denoising autoencoder, we theoretically establish that generalization arises from geometric balance in the representation space rather than from denoising optimization alone. Building on this insight, we propose a training-free representation-guided editing method and an interpretable memorization detection criterion, both grounded in latent-space geometric diagnostics. We validate the framework on unconditional and text-to-image diffusion models, demonstrating a strong correlation between representation balance and generalization performance, high-accuracy memorization detection, and fine-grained controllable generation.
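One way to make the spiky-vs-balanced distinction concrete is an entropy-style balance score over a representation's activation magnitudes. The specific score below is an illustrative proxy, not the paper's actual diagnostic: a balanced code spreads activation mass across many units (score near 1), while a spiky, memorizing code concentrates it on one unit (score near 0).

```python
import numpy as np

def representation_balance(h: np.ndarray, eps: float = 1e-12) -> float:
    """Normalized Shannon entropy of a representation's activation mass.

    ~1.0 indicates a balanced ("generalizing") code; ~0.0 a spiky
    ("memorizing") one. An illustrative proxy, not the paper's criterion.
    """
    p = np.abs(h) / (np.abs(h).sum() + eps)   # activation mass per unit
    entropy = -(p * np.log(p + eps)).sum()    # Shannon entropy of that mass
    return float(entropy / np.log(len(h)))    # normalize to [0, 1]

balanced = representation_balance(np.ones(64))        # uniform code -> ~1.0
spiky = representation_balance(10.0 * np.eye(64)[0])  # one-hot code -> ~0.0
```

Such a scalar could be tracked per sample at inference time, flagging generations whose codes fall below a calibrated threshold as likely regurgitations.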

šŸ“ Abstract
Diffusion models excel at generating high-quality, diverse samples, yet they risk memorizing training data when overfit to the training objective. We analyze the distinctions between memorization and generalization in diffusion models through the lens of representation learning. By investigating a two-layer ReLU denoising autoencoder (DAE), we prove that (i) memorization corresponds to the model storing raw training samples in the learned weights for encoding and decoding, yielding localized "spiky" representations, whereas (ii) generalization arises when the model captures local data statistics, producing "balanced" representations. Furthermore, we validate these theoretical findings on real-world unconditional and text-to-image diffusion models, demonstrating that the same representation structures emerge in deep generative models with significant practical implications. Building on these insights, we propose a representation-based method for detecting memorization and a training-free editing technique that allows precise control via representation steering. Together, our results highlight that learning good representations is central to novel and meaningful generative modeling.
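The abstract's two-layer ReLU DAE claim (i) can be caricatured in a few lines: if the encoder rows literally store the unit-normalized training samples and the decoder is their transpose, then a training point's hidden code collapses onto a single unit, exactly the "spiky" representation described above. This toy construction only illustrates the memorization regime; it is not the paper's proof or its trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 16                                     # training set size, data dim
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # unit-norm training samples

def dae_memorized(x: np.ndarray) -> np.ndarray:
    """Memorizing two-layer ReLU DAE: weights store the raw training set."""
    h = np.maximum(X @ x, 0.0)   # hidden code = similarity to each sample
    return X.T @ h               # decode as a combination of stored samples

# Representation of a training point: dominated by its own stored copy.
h = np.maximum(X @ X[0], 0.0)    # h[0] = 1, all other entries much smaller
```

In the generalizing regime, by contrast, the weights would capture local data statistics shared across samples, so many hidden units fire moderately for any input, yielding the balanced codes the paper associates with generalization.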
Problem

Research questions and friction points this paper is trying to address.

How do memorization and generalization differ in diffusion models, viewed through their learned representations?
What representation structure provably lets a denoising autoencoder generalize rather than memorize?
How can memorization be detected, and generation controlled, directly through representations?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyze memorization vs generalization via representation learning
Propose detection method and training-free editing technique
Validate findings on real-world unconditional and text-to-image models
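A minimal sketch of what "training-free editing via representation steering" can look like: shift an intermediate representation along a semantic direction at inference time, with no weight updates. The difference-of-means direction estimator and the injection point are assumptions for illustration; the paper's exact recipe is not reproduced here.

```python
import numpy as np

def steer(h: np.ndarray, direction: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Shift a hidden representation along a unit-normalized direction.

    `direction` might be estimated as the difference of mean representations
    between two attribute groups (an assumption, not the paper's method).
    """
    d = direction / (np.linalg.norm(direction) + 1e-12)
    return h + scale * d

# Hypothetical usage: push a code toward an attribute by 2 units of norm.
h0 = np.zeros(8)
edited = steer(h0, np.ones(8), scale=2.0)
```

Because the intervention is a pure inference-time shift, the edit strength is a single scalar, which is what makes the fine-grained control claimed in the summary possible.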