🤖 AI Summary
To address the challenge of deploying high-fidelity image generation models on consumer-grade devices (e.g., smartphones, desktop CPUs/GPUs), this work proposes SD3.5-Flash, a distillation framework for rectified flow models tailored to few-step generation. The method makes three main contributions: (1) distribution-guided flow-model distillation, which combines a corrected distillation objective with a distribution-matching loss; (2) a “timestep sharing” mechanism to reduce gradient noise and a “split-timestep fine-tuning” strategy to improve text–image alignment; and (3) a lightweight restructuring of the text encoder coupled with a dedicated INT4 quantization scheme. The resulting model achieves high-fidelity synthesis in only 4–8 sampling steps, significantly improving inference speed and memory efficiency across diverse edge devices. Extensive experiments demonstrate superior performance over state-of-the-art few-step methods in FID, CLIP-Score, and human preference evaluations. To the authors' knowledge, this is the first approach to enable high-quality, real-time image generation on both mobile and desktop platforms.
📝 Abstract
We present SD3.5-Flash, an efficient few-step distillation framework that brings high-quality image generation to accessible consumer devices. Our approach distills computationally prohibitive rectified flow models through a reformulated distribution matching objective tailored specifically for few-step generation. We introduce two key innovations: "timestep sharing" to reduce gradient noise and "split-timestep fine-tuning" to improve prompt alignment. Combined with comprehensive pipeline optimizations like text encoder restructuring and specialized quantization, our system enables both rapid generation and memory-efficient deployment across different hardware configurations. This democratizes access across the full spectrum of devices, from mobile phones to desktop computers. Through extensive evaluation including large-scale user studies, we demonstrate that SD3.5-Flash consistently outperforms existing few-step methods, making advanced generative AI truly accessible for practical deployment.
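For readers unfamiliar with few-step sampling, the sketch below illustrates the general idea of integrating a rectified flow with a handful of Euler steps. It is not the SD3.5-Flash sampler or its distillation objective: `euler_sample` and `toy_velocity` are hypothetical names, the velocity field is a toy stand-in for a trained network, and the convention that t = 1 is pure noise and t = 0 is data is an assumption.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    noise: Vector,
    num_steps: int = 4,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t = 1 (noise) to t = 0 (data)
    using `num_steps` uniform Euler steps."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    x = list(noise)
    for i in range(num_steps):
        t = 1.0 - i / num_steps             # current time
        t_next = 1.0 - (i + 1) / num_steps  # next time, closer to data
        v = velocity(x, t)
        # Euler update; (t_next - t) is negative because we move toward t = 0.
        x = [xi + (t_next - t) * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(target: Vector) -> Callable[[Vector, float], Vector]:
    """Toy stand-in for a learned model (assumption): the exact velocity of the
    straight path x_t = (1 - t) * target + t * noise, i.e. (x_t - target) / t."""
    def velocity(x: Vector, t: float) -> Vector:
        return [(xi - ti) / t for xi, ti in zip(x, target)]
    return velocity


if __name__ == "__main__":
    sample = euler_sample(toy_velocity([1.0, -2.0]), noise=[0.3, 0.7], num_steps=4)
    print(sample)
```

Because the toy paths are perfectly straight, even a single Euler step reaches the target. A trained flow model generally has curved trajectories, which is why reducing the step count typically requires distillation of the kind the abstract describes.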