🤖 AI Summary
Existing distribution-matching distillation methods suffer from diversity collapse and fidelity degradation when reduced to two or fewer sampling steps. This work proposes 1.x-Distill, the first fractional-step distillation framework, which removes the integer-step limitation and makes 1.x-step generation practical. The approach combines a modified teacher classifier-free guidance (CFG), a two-stage Stagewise Focused Distillation strategy, a lightweight compensation module, and Distill-Cache co-training, preserving inference consistency while balancing sample diversity and quality. Evaluated on SD3-Medium and SD3.5-Large, the method outperforms prior few-step approaches at only 1.67 and 1.74 effective NFEs, respectively, accelerating inference by up to 33× relative to the original 28×2 NFE sampling.
📝 Abstract
Diffusion models produce high-quality text-to-image results, but their iterative denoising is computationally expensive. Distribution Matching Distillation (DMD) has emerged as a promising path to few-step distillation, but it suffers from diversity collapse and fidelity degradation when reduced to two steps or fewer. We present 1.x-Distill, the first fractional-step distillation framework that breaks the integer-step constraint of prior few-step methods and establishes 1.x-step generation as a practical regime for distilled diffusion models. Specifically, we first analyze the overlooked role of teacher CFG in DMD and introduce a simple yet effective modification to suppress mode collapse. Then, to improve performance at extremely low step counts, we introduce Stagewise Focused Distillation, a two-stage strategy that learns coarse structure through diversity-preserving distribution matching and refines details with inference-consistent adversarial distillation. Furthermore, we design a lightweight compensation module for Distill-Cache Co-Training, which naturally incorporates block-level caching into our distillation pipeline. Experiments on SD3-Medium and SD3.5-Large show that 1.x-Distill surpasses prior few-step methods, achieving better quality and diversity at 1.67 and 1.74 effective NFEs, respectively, with up to 33× speedup over the original 28×2 NFE sampling.
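The reported speedup can be checked with simple arithmetic. The sketch below assumes that "28×2 NFE" means 28 sampling steps with two network evaluations per step (conditional and unconditional, as in standard CFG), giving a baseline of 56 NFEs. The `effective_nfe` helper is a hypothetical illustration of how block-level caching could yield a fractional count, by crediting each step with the fraction of transformer blocks it actually computes. The block counts used are illustrative placeholders, not values taken from the paper, and the paper's own accounting may differ.

```python
from typing import Sequence


def speedup(baseline_nfe: float, effective_nfe_value: float) -> float:
    """Ratio of baseline network evaluations to effective evaluations."""
    return baseline_nfe / effective_nfe_value


def effective_nfe(blocks_computed_per_step: Sequence[int], total_blocks: int) -> float:
    """Hypothetical accounting: each step counts as the fraction of blocks it runs.

    A step that recomputes every block counts as 1.0; a step that reuses
    cached outputs for some blocks counts as less than 1.0.
    """
    return sum(computed / total_blocks for computed in blocks_computed_per_step)


# Assumption: 28 steps x 2 evaluations per step (conditional + unconditional).
BASELINE_NFE = 28 * 2

# Effective NFEs reported in the abstract.
sd3_medium_speedup = speedup(BASELINE_NFE, 1.67)
sd35_large_speedup = speedup(BASELINE_NFE, 1.74)

# Illustrative only: one full step plus one step computing 16 of 24 blocks.
example_nfe = effective_nfe([24, 16], total_blocks=24)

if __name__ == "__main__":
    print(f"SD3-Medium speedup:  {sd3_medium_speedup:.1f}x")
    print(f"SD3.5-Large speedup: {sd35_large_speedup:.1f}x")
    print(f"Example effective NFE: {example_nfe:.2f}")
```

Under these assumptions the two ratios come out at roughly 33.5 and 32.2, consistent with the abstract's "up to 33×" claim.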