🤖 AI Summary
Diffusion models suffer from high computational overhead due to iterative sampling and quadratic attention complexity; existing training-free acceleration methods improve speed but severely degrade fidelity. To address this, we propose a novel adaptive acceleration framework grounded in the stability theory of ordinary differential equation (ODE) solvers—the first to incorporate numerical stability criteria into diffusion acceleration. Our method jointly optimizes step size and token sparsity via a gradient-aware dynamic sparsification strategy, adapting to diverse prompts and generation trajectories without fine-tuning. It is solver-agnostic, compatible with mainstream ODE-based samplers (e.g., EDM, DPM++) and multimodal pipelines (e.g., ControlNet, MusicLDM). Evaluated on SD-2, SDXL, and Flux, our approach achieves ≥1.8× speedup while maintaining strong fidelity (LPIPS ≤ 0.10, FID ≤ 4.5), outperforming state-of-the-art methods. Cross-modal experiments—including audio generation—further demonstrate its broad generalizability.
📝 Abstract
Diffusion models have achieved remarkable success in generative tasks but suffer from high computational costs due to their iterative sampling process and quadratic attention complexity. Existing training-free acceleration strategies that reduce per-step computation cost, while effectively reducing sampling time, demonstrate low faithfulness compared to the original baseline. We hypothesize that this fidelity gap arises because (a) different prompts correspond to varying denoising trajectories, and (b) such methods do not consider the underlying ODE formulation and its numerical solution. In this paper, we propose Stability-guided Adaptive Diffusion Acceleration (SADA), a novel paradigm that unifies step-wise and token-wise sparsity decisions via a single stability criterion to accelerate sampling of ODE-based generative models (Diffusion and Flow-matching). For (a), SADA adaptively allocates sparsity based on the sampling trajectory. For (b), SADA introduces principled approximation schemes that leverage the precise gradient information from the numerical ODE solver. Comprehensive evaluations on SD-2, SDXL, and Flux using both EDM and DPM++ solvers reveal consistent $\geq 1.8\times$ speedups with minimal fidelity degradation (LPIPS $\leq 0.10$ and FID $\leq 4.5$) compared to unmodified baselines, significantly outperforming prior methods. Moreover, SADA adapts seamlessly to other pipelines and modalities: It accelerates ControlNet without any modifications and speeds up MusicLDM by $1.8\times$ with $\sim 0.01$ spectrogram LPIPS.
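To make the abstract's idea of a single stability criterion driving both step-wise and token-wise sparsity concrete, here is a minimal illustrative sketch. It is not SADA's actual algorithm: the function names (`stability_margin`, `adaptive_decisions`), the finite-difference gradient proxy, and the threshold `tau` are all assumptions for illustration. The idea sketched is that a cheap per-token estimate of how fast the denoising trajectory is changing (derived from consecutive solver outputs) can gate both decisions: tokens with small predicted change can reuse cached computation, and a locally smooth trajectory can tolerate a larger step.

```python
import numpy as np

def stability_margin(eps_prev, eps_curr, h):
    """Per-token finite-difference estimate of how fast the denoising
    trajectory is changing between two solver steps of size h.
    (Illustrative proxy, not the paper's actual criterion.)"""
    return np.abs(eps_curr - eps_prev) / max(h, 1e-8)

def adaptive_decisions(eps_prev, eps_curr, h, tau):
    """Gate both sparsity decisions with one criterion:
    - skip_tokens: tokens whose update is predictably small -> reuse cache
    - take_larger_step: whole trajectory locally smooth -> merge steps"""
    g = stability_margin(eps_prev, eps_curr, h)
    skip_tokens = g < tau
    take_larger_step = bool(g.mean() < tau)
    return skip_tokens, take_larger_step
```

For example, with per-token outputs where one token changes sharply and the rest barely move, only the slowly-varying tokens are flagged for cache reuse, and the step size stays fixed because the mean change exceeds the threshold.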