SD3.5-Flash: Distribution-Guided Distillation of Generative Flows

📅 2025-09-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of deploying high-fidelity image generation models on consumer-grade devices (e.g., smartphones, desktop CPUs/GPUs), this work proposes a flow-model distillation framework tailored for few-step generation. The method introduces three key innovations: (1) distribution-guided flow-model distillation, which integrates a corrected distillation objective with a distribution-matching loss; (2) a "timestep sharing" mechanism to reduce gradient noise and a "split-timestep fine-tuning" strategy to improve text–image alignment; and (3) lightweight text encoder restructuring coupled with a dedicated INT4 quantization scheme. The resulting model achieves high-fidelity synthesis in only 4–8 sampling steps, significantly improving inference speed and memory efficiency across diverse edge devices. Extensive experiments demonstrate superior performance over state-of-the-art few-step methods in FID, CLIP score, and human preference evaluations. To the authors' knowledge, this is the first approach to enable high-quality, real-time image generation across both mobile and desktop platforms.
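The "4–8 sampling steps" refers to a short Euler integration of the rectified-flow ODE that the distilled student targets. A minimal sketch of that sampling loop, where `velocity_fn` is a hypothetical stand-in for the distilled network (in practice a large diffusion transformer), not the paper's actual implementation:

```python
import numpy as np

def few_step_sample(velocity_fn, noise, steps=4):
    """Euler-integrate a rectified-flow ODE from t=1 (pure noise) down to
    t=0 (data) in a handful of steps -- the few-step regime described above."""
    x = noise
    ts = np.linspace(1.0, 0.0, steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        v = velocity_fn(x, t_cur)       # model's velocity prediction at time t
        x = x + (t_next - t_cur) * v    # Euler step (t decreases toward 0)
    return x

# Toy check with the oracle velocity of a straight path x_t = (1-t)*x0 + t*eps:
# its velocity dx/dt = eps - x0 is constant, so Euler recovers x0 exactly.
x0 = np.array([1.0, -2.0, 0.5])
eps = np.array([0.3, 0.7, -1.1])
sample = few_step_sample(lambda x, t: eps - x0, eps.copy(), steps=4)
```

On the perfectly straight toy path, four Euler steps land exactly on `x0`; real data flows are curved, which is why a distilled few-step model is needed to stay accurate at such coarse step counts.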

📝 Abstract
We present SD3.5-Flash, an efficient few-step distillation framework that brings high-quality image generation to accessible consumer devices. Our approach distills computationally prohibitive rectified flow models through a reformulated distribution matching objective tailored specifically for few-step generation. We introduce two key innovations: "timestep sharing" to reduce gradient noise and "split-timestep fine-tuning" to improve prompt alignment. Combined with comprehensive pipeline optimizations like text encoder restructuring and specialized quantization, our system enables both rapid generation and memory-efficient deployment across different hardware configurations. This democratizes access across the full spectrum of devices, from mobile phones to desktop computers. Through extensive evaluation including large-scale user studies, we demonstrate that SD3.5-Flash consistently outperforms existing few-step methods, making advanced generative AI truly accessible for practical deployment.
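The "specialized quantization" mentioned in the abstract is a dedicated INT4 scheme; its exact design is not detailed on this page, so the sketch below shows only the generic recipe such schemes build on: symmetric per-output-channel INT4 (integer values in [-8, 7]) with one float scale per channel. All names here are illustrative, not the paper's API:

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-output-channel INT4 quantization of a weight matrix
    (rows = output channels). Generic sketch, not the paper's scheme."""
    max_abs = np.max(np.abs(w), axis=1, keepdims=True)
    scale = np.where(max_abs == 0, 1.0, max_abs / 7.0)   # map range to [-7, 7]
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale)
```

Round-trip error is bounded by half a quantization step per weight (`scale / 2`); per-channel scales keep that bound tight even when different layers or channels have very different dynamic ranges, which is what makes 4-bit weights viable for memory-constrained edge deployment.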
Problem

Research questions and friction points this paper is trying to address.

Efficient distillation of complex flow models for few-step image generation
Reducing computational requirements for high-quality image generation on consumer devices
Improving prompt alignment and memory efficiency across diverse hardware configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distribution matching objective for few-step generation
Timestep sharing to reduce gradient noise
Split-timestep fine-tuning for prompt alignment