🤖 AI Summary
Problem: The geometric mechanisms underlying the generalization capability of diffusion models remain poorly understood, in particular whether their success under the manifold hypothesis stems from adapting to the data's intrinsic low-dimensional structure.
Method: Starting from score matching, we show theoretically that smoothing the score function, or equivalently smoothing in the log-density domain, produces smoothing tangential to the data manifold, which acts as an implicit tangential regularization. We further show that the choice of smoothing can be tuned to control the manifold along which the model generalizes.
Contribution/Results: Theoretical analysis and numerical experiments support the conjecture that this geometric adaptivity contributes to the generalization of diffusion models. The work offers a way to understand the implicit regularization in diffusion models: smoothing the score acts along tangential directions of the data manifold rather than along normal directions.
📝 Abstract
Diffusion models have achieved state-of-the-art performance, demonstrating remarkable generalisation capabilities across diverse domains. However, the mechanisms underpinning these strong capabilities remain only partially understood. A leading conjecture, based on the manifold hypothesis, attributes this success to their ability to adapt to low-dimensional geometric structure within the data. This work provides evidence for this conjecture, focusing on how such phenomena could result from the formulation of the learning problem through score matching. We inspect the role of implicit regularisation by investigating the effect of smoothing minimisers of the empirical score matching objective. Our theoretical and empirical results confirm that smoothing the score function -- or equivalently, smoothing in the log-density domain -- produces smoothing tangential to the data manifold. In addition, we show that the manifold along which the diffusion model generalises can be controlled by choosing an appropriate smoothing.
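The objects named in the abstract can be made concrete with a minimal sketch. It rests on assumptions that are not stated in the abstract: a single variance-exploding Gaussian noise level `sigma`, for which the minimiser of the empirical denoising score matching objective is the score of a Gaussian mixture centred on the training points, and an isotropic Gaussian smoothing kernel of width `kernel_std`, which may differ from the smoothing studied in the paper. The function names and the toy circle dataset are hypothetical. Because convolution commutes with differentiation, averaging the score over kernel perturbations equals the gradient of the kernel-smoothed log-density, which is the equivalence the abstract refers to.

```python
import numpy as np


def log_density(x, data, sigma):
    """Log-density at x of the Gaussian mixture (1/N) * sum_i N(x_i, sigma^2 I)."""
    d = data.shape[1]
    sq_dist = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma**2)
    m = logits.max()  # subtract the max for a numerically stable log-mean-exp
    log_norm = 0.5 * d * np.log(2.0 * np.pi * sigma**2)
    return m + np.log(np.mean(np.exp(logits - m))) - log_norm


def empirical_score(x, data, sigma):
    """Gradient of log_density at x: a softmax-weighted pull towards the data."""
    sq_dist = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma**2)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return (w @ data - x) / sigma**2


def smoothed_score(x, data, sigma, kernel_std, n_samples=256, seed=0):
    """Monte Carlo estimate of the score convolved with an isotropic Gaussian
    kernel (an assumed kernel, chosen here only for illustration)."""
    rng = np.random.default_rng(seed)
    offsets = kernel_std * rng.standard_normal((n_samples, x.shape[0]))
    return np.mean([empirical_score(x + o, data, sigma) for o in offsets], axis=0)


# Hypothetical toy data: 16 training points on the unit circle, a 1-D manifold in 2-D.
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
data = np.stack([np.cos(angles), np.sin(angles)], axis=1)

x = np.array([1.1, 0.2])
print("empirical score:", empirical_score(x, data, sigma=0.3))
print("smoothed score: ", smoothed_score(x, data, sigma=0.3, kernel_std=0.1))
```

Comparing the two vectors at points near the circle, for different values of `kernel_std`, is one way to explore how smoothing changes the score in the tangential and normal directions; the sketch itself makes no claim about what that comparison shows.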