🤖 AI Summary
MeanFlow, a few-step generative modeling framework, has an objective that decomposes into trajectory flow matching and trajectory consistency. Gradient analysis shows that these two terms are strongly negatively correlated, which causes optimization conflict and slow convergence. To address this, we propose $\alpha$-Flow, a family of objectives that unifies trajectory flow matching, Shortcut Models, and MeanFlow under one formulation. A curriculum strategy smoothly anneals the training objective from trajectory flow matching toward MeanFlow, which disentangles the conflicting terms and improves convergence. Trained from scratch on class-conditional ImageNet-1K 256×256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings. The largest model, $\alpha$-Flow-XL/2+, reaches FID 2.58 with 1 NFE and FID 2.15 with 2 NFE, a new state of the art among vanilla DiT backbones.
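To make the curriculum idea concrete, here is a minimal sketch of an annealing schedule. It assumes a hypothetical scalar weight `alpha`, where 1.0 stands for pure trajectory flow matching and values near 0 approach the MeanFlow objective. The cosine ramp, the step boundaries, and the name `alpha_schedule` are illustrative assumptions and are not taken from the paper.

```python
import math


def alpha_schedule(step, start_step, end_step, alpha_min=0.0):
    """Hypothetical curriculum weight (assumption, not the paper's schedule).

    Returns 1.0 before `start_step`, `alpha_min` after `end_step`,
    and a smooth cosine ramp between the two in the middle.
    """
    if step <= start_step:
        return 1.0
    if step >= end_step:
        return alpha_min
    # Fraction of the annealing window that has elapsed, in (0, 1).
    progress = (step - start_step) / (end_step - start_step)
    # Cosine ramp: equals 1.0 at progress=0 and alpha_min at progress=1.
    return alpha_min + (1.0 - alpha_min) * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Illustrative step counts only.
    for s in (0, 100, 125, 150, 175, 200, 300):
        print(s, round(alpha_schedule(s, start_step=100, end_step=200), 4))
```

A smooth ramp is one plausible choice here because an abrupt switch between objectives would reintroduce the conflict the curriculum is meant to avoid. A linear or sigmoid ramp would fit the same interface.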
📝 Abstract
MeanFlow has recently emerged as a powerful framework for few-step generative modeling trained from scratch, but its success is not yet fully understood. In this work, we show that the MeanFlow objective naturally decomposes into two parts: trajectory flow matching and trajectory consistency. Through gradient analysis, we find that these terms are strongly negatively correlated, causing optimization conflict and slow convergence. Motivated by these insights, we introduce $\alpha$-Flow, a broad family of objectives that unifies trajectory flow matching, Shortcut Models, and MeanFlow under one formulation. By adopting a curriculum strategy that smoothly anneals from trajectory flow matching to MeanFlow, $\alpha$-Flow disentangles the conflicting objectives and achieves better convergence. When trained from scratch on class-conditional ImageNet-1K 256×256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings. Our largest $\alpha$-Flow-XL/2+ model achieves new state-of-the-art results using vanilla DiT backbones, with FID scores of 2.58 (1-NFE) and 2.15 (2-NFE).
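The gradient analysis mentioned above can be pictured with a small sketch. It assumes the gradients of the two loss terms are available as flat lists of floats and measures their alignment with cosine similarity; a value near -1 means the two terms pull the parameters in opposing directions. The function name and the toy vectors are hypothetical, and the paper's actual measurement procedure may differ.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine similarity between two flattened gradient vectors.

    Returns 0.0 if either vector has zero norm (a convention chosen
    for this sketch, not something specified by the paper).
    """
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


if __name__ == "__main__":
    # Toy, made-up gradients for the two terms of the objective.
    grad_flow_matching = [1.0, -2.0, 0.5]
    grad_consistency = [-0.9, 2.1, -0.4]
    sim = cosine_similarity(grad_flow_matching, grad_consistency)
    # A negative value indicates conflicting update directions.
    print(f"cosine similarity: {sim:.3f}")
```

In practice the two gradients would come from backpropagating each loss term separately through the same network and flattening the per-parameter gradients into one vector.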