SubFlow: Sub-mode Conditioned Flow Matching for Diverse One-Step Generation

📅 2026-04-14

🤖 AI Summary
One-step flow matching models trained with a mean squared error (MSE) objective tend to average over the sub-modes within a class, which suppresses low-density but valid modes and reduces generation diversity. This work proposes sub-mode conditioning: semantic clustering first partitions each class into sub-modes, and flow matching is then conditioned on the sub-mode index so that each conditional sub-distribution is approximately unimodal. The approach requires no architectural changes and plugs into existing one-step frameworks such as MeanFlow and Shortcut Models. On ImageNet-256 it yields substantial gains in generation diversity (measured by Recall) while keeping image quality (FID) competitive.
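
The clustering stage described above can be sketched on toy data. The summary does not say which clustering algorithm or feature extractor is used, so this sketch assumes plain k-means over precomputed feature vectors; `kmeans`, `assign_submodes`, and the toy features are hypothetical names and values, not the authors' code.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length feature tuples.

    Assumption: k-means stands in for the unspecified "semantic clustering".
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from distinct data points
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        assign = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def assign_submodes(features_by_class, k):
    """Cluster within each class; return {class_label: [sub-mode index per sample]}."""
    return {c: kmeans(feats, k) for c, feats in features_by_class.items()}


# Hypothetical 2-D features for one class: a frequent sub-mode and a rare one.
frequent = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1)]
rare = [(10.0, 10.0), (10.2, 9.9)]

labels = assign_submodes({"dog": frequent + rare}, k=2)["dog"]
print(labels)  # the six frequent samples share one index, the two rare ones the other
```

The resulting `(class, sub-mode index)` pair would then replace the class label as the conditioning signal during training, which is why no change to the network architecture is needed beyond a larger label vocabulary.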

📝 Abstract
Flow matching has emerged as a powerful generative framework, with recent few-step methods achieving remarkable inference acceleration. However, we identify a critical yet overlooked limitation: these models suffer from severe diversity degradation, concentrating samples on dominant modes while neglecting rare but valid variations of the target distribution. We trace this degradation to averaging distortion: when trained with MSE objectives, class-conditional flows learn a frequency-weighted mean over intra-class sub-modes, causing the model to over-represent high-density modes while systematically neglecting low-density ones. To address this, we propose SubFlow, Sub-mode Conditioned Flow Matching, which eliminates averaging distortion by decomposing each class into fine-grained sub-modes via semantic clustering and conditioning the flow on sub-mode indices. Each conditioned sub-distribution is approximately unimodal, so the learned flow accurately targets individual modes with no averaging distortion, restoring full mode coverage in a single inference step. Crucially, SubFlow is entirely plug-and-play: it integrates seamlessly into existing one-step models such as MeanFlow and Shortcut Models without any architectural modifications. Extensive experiments on ImageNet-256 demonstrate that SubFlow yields substantial gains in generation diversity (Recall) while maintaining competitive image quality (FID), confirming its broad applicability across different one-step generation frameworks. Project page: https://yexionglin.github.io/subflow.
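
The averaging distortion argument can be illustrated with a deliberately simplified toy, assuming the one-step predictor reduces to a single constant per conditioning label (real models also depend on the noise input, so this is only an intuition aid). Under that assumption the MSE-optimal output is the mean of the targets sharing a label. The sample values and the helper `mse_optimal_constant` are hypothetical.

```python
def mse_optimal_constant(targets):
    """The constant that minimizes mean squared error over targets is their mean."""
    return sum(targets) / len(targets)


# Hypothetical 1-D class with two sub-modes:
# a frequent one at -2.0 (90% of samples) and a rare one at +3.0 (10%).
samples = [(-2.0, 0)] * 9 + [(3.0, 1)] * 1  # (value, sub-mode index)

# Class-conditional: one prediction shared by the whole class.
class_pred = mse_optimal_constant([x for x, _ in samples])

# Sub-mode-conditional: one prediction per sub-mode index.
submode_pred = {
    m: mse_optimal_constant([x for x, s in samples if s == m])
    for m in sorted({s for _, s in samples})
}

print(class_pred)    # -1.5: pulled toward the frequent mode, never reaches +3.0
print(submode_pred)  # {0: -2.0, 1: 3.0}: each sub-mode is hit exactly
```

The class-level prediction is the frequency-weighted mean, so it sits near the dominant sub-mode and the rare one is never produced. Conditioning on the sub-mode index makes each target set unimodal, and the same MSE objective then recovers both modes.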

Problem

Research questions and friction points this paper is trying to address.

flow matching
generation diversity
mode collapse
one-step generation
sub-mode representation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sub-mode Conditioning
Flow Matching
One-Step Generation
Diversity Preservation
Averaging Distortion