🤖 AI Summary
Diffusion models suffer from slow inference and high computational overhead. To address this, we propose a gradient-flow-guided iterative pruning method that achieves model compression via progressive soft pruning and gradient-flow-driven sparse-space optimization, without significant degradation in generation quality. Our key contributions are: (1) a novel gradient-flow sensitivity criterion—replacing conventional weight-magnitude-based metrics—for identifying unimportant parameters; (2) a continuous mask-updating mechanism to mitigate performance collapse caused by one-shot hard pruning; and (3) energy-function-guided sparse training to enhance convergence stability of the pruned subnetwork. Evaluated on multiple image generation benchmarks, our method achieves an average 2.1× speedup and 38% energy reduction over state-of-the-art one-shot pruning approaches, with only a marginal FID increase of +0.4—outperforming existing sparse diffusion model methods.
📝 Abstract
Diffusion Models (DMs) have impressive capabilities among generation models, but are limited to slower inference speeds and higher computational costs. Previous works utilize one-shot structure pruning to derive lightweight DMs from pre-trained ones, but this approach often leads to a significant drop in generation quality and may result in the removal of crucial weights. Thus we propose a iterative pruning method based on gradient flow, including the gradient flow pruning process and the gradient flow pruning criterion. We employ a progressive soft pruning strategy to maintain the continuity of the mask matrix and guide it along the gradient flow of the energy function based on the pruning criterion in sparse space, thereby avoiding the sudden information loss typically caused by one-shot pruning. Gradient-flow based criterion prune parameters whose removal increases the gradient norm of loss function and can enable fast convergence for a pruned model in iterative pruning stage. Our extensive experiments on widely used datasets demonstrate that our method achieves superior performance in efficiency and consistency with pre-trained models.