FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation

📅 2026-05-07
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing pixel-space image generation methods typically treat synthesis as a frequency-homogeneous process, overlooking the distinct roles and learning dynamics of low- and high-frequency components. This work proposes FREPix, a frequency-heterogeneous flow matching framework that explicitly decomposes generation into low- and high-frequency components and models them separately. By combining separate transport paths, a factorized network, and a frequency-aware training objective, the approach makes coarse-to-fine generation an explicit design principle rather than an implicit behavior. On class-conditional ImageNet generation, FREPix reports FID scores of 1.91 at 256×256 and 2.38 at 512×512, which the authors describe as competitive among pixel-space generation models, with particularly strong behavior in the low-NFE (number of function evaluations) regime.
📝 Abstract
Pixel-space diffusion has re-emerged as a promising alternative to latent-space generation because it avoids the representation bottleneck introduced by VAEs. Yet most existing methods still treat image generation as a frequency-homogeneous process, overlooking the distinct roles and learning dynamics of low- and high-frequency components. To address this, we propose FREPix, a FREquency-heterogeneous flow matching framework for Pixel-space image generation. FREPix explicitly decomposes generation into low- and high-frequency components, assigns them separate transport paths, predicts them with a factorized network, and trains them with a frequency-aware objective. In this way, coarse-to-fine generation becomes an explicit design principle rather than an implicit behavior. On ImageNet class-to-image generation, FREPix achieves competitive results among pixel-space generation models, reaching 1.91 FID at $256\times256$ and 2.38 FID at $512\times512$, with particularly strong behavior in the low-NFE regime.
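To make the idea of separate transport paths for low- and high-frequency components more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify how FREPix decomposes frequencies or parameterizes its paths, so everything here is an assumption: the FFT low-pass mask, the `cutoff` parameter, the function names `split_frequencies` and `heterogeneous_path`, and the two schedules (`sqrt(t)` for the low band, `t**2` for the high band) are hypothetical choices picked only so that the coarse structure moves toward the data earlier than the fine detail.

```python
import numpy as np


def split_frequencies(image, cutoff=0.25):
    """Split an (H, W) array into low- and high-frequency parts.

    Assumption: a radial FFT low-pass mask; `cutoff` is a hypothetical
    fraction of the Nyquist frequency, not a value from the paper.
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequency, cycles per pixel
    mask = np.sqrt(fx**2 + fy**2) <= cutoff * 0.5
    low = np.fft.ifft2(np.fft.fft2(image) * mask).real
    high = image - low  # residual holds everything outside the mask
    return low, high


def heterogeneous_path(image, noise, t, cutoff=0.25):
    """Interpolate noise -> image with a different schedule per band.

    Assumption: the low band follows sqrt(t) (runs ahead) and the high
    band follows t**2 (lags behind), so coarse structure settles first.
    Both schedules equal 0 at t=0 and 1 at t=1.
    """
    low, high = split_frequencies(image, cutoff)
    noise_low, noise_high = split_frequencies(noise, cutoff)
    a, b = np.sqrt(t), t**2
    x_low = (1.0 - a) * noise_low + a * low
    x_high = (1.0 - b) * noise_high + b * high
    return x_low + x_high


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))   # stand-in for one image channel
noise = rng.standard_normal((32, 32))   # Gaussian source sample
x_mid = heterogeneous_path(image, noise, t=0.5)
```

At `t=0` the path returns the noise sample and at `t=1` it returns the image, while at intermediate times the low band is closer to the data than the high band. A homogeneous path would correspond to using the same schedule for both bands.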
Problem

Research questions and friction points this paper is trying to address.

pixel-space image generation
frequency heterogeneity
low-frequency components
high-frequency components
image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

frequency-heterogeneous
flow matching
pixel-space generation
factorized network
frequency-aware objective