The Amazing Stability of Flow Matching

📅 2026-04-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study systematically investigates how dataset size, model architecture, and training configuration affect the performance of flow-matching generative models. Through perturbation experiments on the CelebA-HQ dataset (data pruning of up to 50%, architectural substitutions, and adjustments to training settings), the work quantifies the stability of flow-matching models in terms of sample quality, diversity, and latent representations. Even with substantial reductions in training data, these models retain high-quality and diverse generation capabilities, and models trained under different conditions produce visually similar outputs for the same random seed. These findings provide empirical evidence for the robustness of flow-matching approaches.

📝 Abstract

The success of deep generative models in generating high-quality and diverse samples is often attributed to particular architectures and large training datasets. In this paper, we investigate the impact of these factors on the quality and diversity of samples generated by flow-matching models. Surprisingly, in our experiments on the CelebA-HQ dataset, flow matching remains stable even when 50% of the dataset is pruned: the quality and diversity of generated samples are preserved. Moreover, pruning affects the latent representation only slightly; that is, models trained on the full and on the pruned dataset map a given seed to visually similar outputs. We observe similar stability when changing the architecture or training configuration, with the latent representation maintained under these changes as well. Our results quantify just how strong this stability can be in practice and help explain the reliability of flow-matching models under various perturbations.
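The seed-consistency comparison described in the abstract can be illustrated with a toy sketch. This is not the paper's code or experimental setup. It assumes a linear interpolation path between noise and data, Gaussian toy data whose flow-matching velocity field is known in closed form (standing in for a trained network), and simple Euler integration. The names `gaussian_velocity`, `sample`, and `seed_distance`, and the parameter values for the "full" and "pruned" models, are hypothetical.

```python
import numpy as np

def gaussian_velocity(mu, s):
    """Closed-form flow-matching velocity for data ~ N(mu, s^2 I), noise ~ N(0, I),
    and the linear path x_t = (1 - t) * x0 + t * x1.
    Hypothetical stand-in for a trained velocity network."""
    def v(x, t):
        var_t = (1.0 - t) ** 2 + (t * s) ** 2  # variance of x_t per dimension
        return mu + (t * s**2 - (1.0 - t)) / var_t * (x - t * mu)
    return v

def sample(velocity, seed, n=256, dim=2, n_steps=500):
    """Draw noise from a fixed seed and integrate dx/dt = v(x, t) from t=0 to t=1
    with explicit Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, dim))
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
    return x

def seed_distance(v_a, v_b, seed_a, seed_b):
    """Mean Euclidean distance between paired samples of two models."""
    diff = sample(v_a, seed_a) - sample(v_b, seed_b)
    return float(np.linalg.norm(diff, axis=1).mean())

# Assumption: the model trained on pruned data learns a slightly perturbed field.
v_full = gaussian_velocity(mu=1.0, s=0.50)
v_pruned = gaussian_velocity(mu=1.1, s=0.55)

same_seed = seed_distance(v_full, v_pruned, seed_a=0, seed_b=0)
diff_seed = seed_distance(v_full, v_full, seed_a=0, seed_b=1)
print(f"full vs. pruned, same seed:      {same_seed:.3f}")
print(f"full vs. full, different seeds:  {diff_seed:.3f}")
```

In this toy setting, two slightly different velocity fields send the same noise sample to nearby outputs, while a change of seed moves the output much further. For images, a perceptual or feature-space distance would likely be more informative than the Euclidean distance used here.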

Problem

Research questions and friction points this paper is trying to address.

flow matching

dataset pruning

model stability

latent representation

generative models

Innovation

Methods, ideas, or system contributions that make the work stand out.

flow matching

dataset pruning

latent stability