🤖 AI Summary
GANs often generate poor-quality samples and cover too few modes on data with a complex manifold, because the standard adversarial loss cannot eliminate regions of positive Lebesgue measure in the generated manifold that lie outside the true data manifold. This work gives a theoretical characterization of that limitation and proposes score matching regularity (SMaRt), a plug-and-play technique that adds a gradient-based regularizer to the GAN training objective. The regularizer is derived from the approximate score function of an off-the-shelf pre-trained diffusion model, and it persistently pulls generated samples toward the true data manifold. SMaRt is architecture-agnostic and compatible with mainstream GAN frameworks. On ImageNet 64×64, it improves Aurora's FID from 8.87 to 7.11, on par with a one-step consistency model. Experiments on a toy example and on real-world datasets show consistent gains in synthesis performance for various state-of-the-art GANs.
📝 Abstract
Generative adversarial networks (GANs) usually struggle to learn from highly diverse data whose underlying manifold is complex. In this work, we revisit the mathematical foundations of GANs and show theoretically that the native adversarial loss for GAN training is insufficient to fix the problem of subsets of the generated data manifold with positive Lebesgue measure lying outside the real data manifold. Instead, we find that score matching is a promising solution to this issue, thanks to its ability to persistently push generated data points towards the real data manifold. We therefore propose to improve the optimization of GANs with score matching regularity (SMaRt). As empirical evidence, we first design a toy example to show that training GANs with the aid of a ground-truth score function helps reproduce the real data distribution more accurately, and then confirm that our approach consistently boosts the synthesis performance of various state-of-the-art GANs on real-world datasets, with pre-trained diffusion models acting as the approximate score function. For instance, when training Aurora on the ImageNet 64x64 dataset, we improve FID from 8.87 to 7.11, on par with the performance of a one-step consistency model. The source code will be made public.
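
The pulling effect of a score function can be illustrated in a toy setting. The following is a minimal sketch, not the paper's implementation: it assumes a hypothetical two-component Gaussian mixture as the "real" distribution, so the ground-truth score is available in closed form, and it moves off-distribution points along that score directly. The names `MEANS`, `SIGMA`, `score`, and `pull_toward_data`, as well as the step size and step count, are illustrative assumptions. In the method described above, the score is instead approximated by a pre-trained diffusion model, and the signal reaches the generator through the training objective rather than being applied to samples.

```python
import numpy as np

# Hypothetical toy "real" distribution: equal-weight mixture of two
# isotropic 2D Gaussians. These values are assumptions for illustration.
MEANS = np.array([[-2.0, 0.0], [2.0, 0.0]])
SIGMA = 0.5


def log_density(x):
    """Log-density of the mixture at points x with shape (n, 2)."""
    diff = x[:, None, :] - MEANS[None, :, :]            # (n, k, 2)
    sq = np.sum(diff ** 2, axis=-1)                     # (n, k)
    log_comp = -sq / (2 * SIGMA ** 2) - np.log(2 * np.pi * SIGMA ** 2)
    m = log_comp.max(axis=1, keepdims=True)             # stable log-sum-exp
    lse = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return lse - np.log(len(MEANS))


def score(x):
    """Ground-truth score, i.e. the gradient of log_density w.r.t. x."""
    diff = x[:, None, :] - MEANS[None, :, :]            # (n, k, 2)
    logits = -np.sum(diff ** 2, axis=-1) / (2 * SIGMA ** 2)
    logits -= logits.max(axis=1, keepdims=True)
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)                   # component responsibilities
    return -(w[:, :, None] * diff).sum(axis=1) / SIGMA ** 2


def pull_toward_data(x, step=0.05, n_steps=50):
    """Move points along the score (gradient ascent on the log-density)."""
    for _ in range(n_steps):
        x = x + step * score(x)
    return x


# Points that start far from both modes, standing in for generated samples
# that lie off the real data distribution.
generated = np.array([[3.0, 3.0], [-1.0, -4.0], [0.5, 2.5]])
pulled = pull_toward_data(generated)
print(log_density(generated))
print(log_density(pulled))
```

Each point ends up with a higher log-density under the toy distribution than it started with, which is the behaviour the abstract attributes to score matching: the score keeps providing a direction toward the data even for samples that are far from it.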