🤖 AI Summary
This study systematically investigates how dataset size, model architecture, and training configuration affect the performance of flow-matching generative models. Through perturbation experiments on the CelebA-HQ dataset, including data pruning (up to 50%), architectural substitutions, and adjustments to training settings, the work quantifies the stability of flow-matching models in terms of sample quality, diversity, and latent representations. Notably, even with substantial reductions in training data, these models retain high-quality and diverse generation capabilities and produce visually similar outputs under identical random seeds. These findings provide empirical evidence for the robustness of flow-matching approaches.
📝 Abstract
The success of deep generative models in generating high-quality and diverse samples is often attributed to particular architectures and large training datasets. In this paper, we investigate the impact of these factors on the quality and diversity of samples generated by flow-matching models. Surprisingly, in our experiments on the CelebA-HQ dataset, flow matching remains stable even when 50% of the dataset is pruned: the quality and diversity of generated samples are preserved. Moreover, pruning affects the latent representation only slightly; that is, models trained on the full and pruned datasets map a given seed to visually similar outputs. We observe similar stability when changing the architecture or training configuration, with the latent representation maintained under these changes as well.
Our results quantify just how strong this stability can be in practice, and help explain the reliability of flow-matching models under various perturbations.
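The seed-matched comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's code or setup. It assumes a 1-D toy dataset in place of CelebA-HQ, uniform random pruning, and the closed-form flow-matching velocity for the linear path x_t = (1 - t) x0 + t x1 as a stand-in for a trained network. The names `prune`, `velocity`, and `sample`, and the absolute difference used as a similarity measure, are all hypothetical choices made for illustration.

```python
import math
import random


def prune(dataset, keep_fraction, seed=0):
    """Keep a uniformly random subset (assumed pruning rule, for illustration)."""
    rng = random.Random(seed)
    k = max(1, int(len(dataset) * keep_fraction))
    return rng.sample(dataset, k)


def velocity(x, t, dataset):
    """Closed-form flow-matching velocity for x_t = (1 - t) x0 + t x1,
    with x0 ~ N(0, 1) and x1 uniform over `dataset`.
    Toy stand-in for a trained network; requires t < 1."""
    s = 1.0 - t
    # Log-weights of the posterior over x1 given x_t = x: N(x; t * x1, s^2).
    logits = [-((x - t * x1) ** 2) / (2.0 * s * s) for x1 in dataset]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    x1_mean = sum(w * x1 for w, x1 in zip(weights, dataset)) / total
    # u_t(x) = (E[x1 | x_t = x] - x) / (1 - t)
    return (x1_mean - x) / s


def sample(dataset, seed, n_steps=100):
    """Euler-integrate dx/dt = v(x, t) over [0, 1], starting from seeded noise."""
    x = random.Random(seed).gauss(0.0, 1.0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x += dt * velocity(x, k * dt, dataset)
    return x


# Toy two-mode dataset (assumption: stands in for the image dataset).
data_rng = random.Random(1)
full = [
    data_rng.gauss(-2.0, 0.5) if data_rng.random() < 0.5 else data_rng.gauss(2.0, 0.5)
    for _ in range(200)
]
pruned = prune(full, keep_fraction=0.5)

# Same seed, two "models": compare where each seed lands.
gaps = [abs(sample(full, s) - sample(pruned, s)) for s in range(20)]
mean_gap = sum(gaps) / len(gaps)
print(f"mean |full - pruned| over {len(gaps)} seeds: {mean_gap:.3f}")
```

A small mean gap would indicate that the full-data and pruned-data samplers send the same noise draw to nearby outputs. For images, a perceptual or pixel-space distance would take the place of the absolute difference used here.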