Shortcutting Pre-trained Flow Matching Diffusion Models is Almost Free Lunch

📅 2025-10-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the inefficiency of flow-matching diffusion models, which typically require numerous sampling steps. We propose a post-training compression method that requires no retraining. Our approach centers on an online self-distillation mechanism grounded in the velocity field, integrated with trajectory-skipping learning and a lightweight formulation that needs no step-size embedding. This enables, for the first time, aggressive step skipping for standard flow-matching models (e.g., Flux). It supports both pretrained fusion and standalone post-training paradigms, and is the first few-shot distillation framework applicable to billion-parameter-scale diffusion models. Using less than one A100 GPU-day, we compress Flux into a 3-step sampler, achieving state-of-the-art generation quality at minimal computational cost. Moreover, the method enables efficient adaptation using only ten text–image pairs.

📝 Abstract
We present an ultra-efficient post-training method for shortcutting large-scale pre-trained flow matching diffusion models into efficient few-step samplers, enabled by novel velocity field self-distillation. While shortcutting in flow matching, originally introduced by shortcut models, offers flexible trajectory-skipping capabilities, it requires a specialized step-size embedding incompatible with existing models unless retraining from scratch, a process nearly as costly as pretraining itself. Our key contribution is thus imparting a more aggressive shortcut mechanism to standard flow matching models (e.g., Flux), leveraging a unique distillation principle that obviates the need for step-size embedding. Working on the velocity field rather than sample space and learning rapidly from self-guided distillation in an online manner, our approach trains efficiently, e.g., producing a 3-step Flux in less than one A100 day. Beyond distillation, our method can be incorporated into the pretraining stage itself, yielding models that inherently learn efficient, few-step flows without compromising quality. This capability also enables, to our knowledge, the first few-shot distillation method (e.g., 10 text-image pairs) for dozen-billion-parameter diffusion models, delivering state-of-the-art performance at almost free cost.
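The core mechanism the abstract describes, distilling in velocity space so that one large Euler step matches the composition of two smaller steps, can be sketched as follows. This is a minimal illustration under assumed details, not the paper's implementation: the function names and the plain mean-squared loss are hypothetical, and in actual training the target would come from a stop-gradient (self-guided) pass of the same network.

```python
import numpy as np

def two_step_velocity_target(v_fn, x_t, t, dt):
    """Compose two small Euler steps of size dt and return the average
    velocity over the interval [t, t + 2*dt]. This serves as the
    self-distillation target for a single big step of size 2*dt."""
    v1 = v_fn(x_t, t)              # velocity at the current point
    x_mid = x_t + dt * v1          # small Euler step to the midpoint
    v2 = v_fn(x_mid, t + dt)       # velocity at the midpoint
    return 0.5 * (v1 + v2)         # average velocity across both steps

def self_distill_loss(v_fn, x_t, t, dt):
    """Match the model's one-shot velocity against the two-step target,
    in velocity space rather than sample space. In training, the target
    pass would be gradient-free while this prediction carries gradients."""
    v_target = two_step_velocity_target(v_fn, x_t, t, dt)
    v_pred = v_fn(x_t, t)
    return float(np.mean((v_pred - v_target) ** 2))
```

For a toy linear field `v(x, t) = -x`, the big-step prediction and the two-step target disagree by `dt/2 * x`, so the loss shrinks as the step size does; training pulls the one-step velocity toward the averaged two-step flow without any step-size embedding in the network.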
Problem

Research questions and friction points this paper is trying to address.

Accelerating pre-trained flow matching diffusion models into few-step samplers
Enabling efficient shortcutting without retraining or step-size embeddings
Achieving few-shot distillation for billion-parameter models with minimal cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

Velocity field self-distillation enables shortcutting
Online self-guided distillation trains rapidly
Method integrates into pretraining for few-step flows
Xu Cai
The authors conducted this work as independent researchers, pursued as a hobby and without institutional affiliation.
Yang Wu
AI Research Center, iHuman Inc.
Qianli Chen
The authors conducted this work as independent researchers, pursued as a hobby and without institutional affiliation.
Haoran Wu
The authors conducted this work as independent researchers, pursued as a hobby and without institutional affiliation.
Lichuan Xiang
Department of Computer Science, University of Warwick.
Hongkai Wen
University of Warwick
Machine Learning · ML/AI Systems · Cyber-Physical Systems