Tortoise and Hare Guidance: Accelerating Diffusion Model Inference with Multirate Integration

📅 2025-11-06
🤖 AI Summary
Problem: Diffusion models are slow at inference because iterative sampling requires many costly network evaluations. Method: We propose a training-free acceleration framework that reformulates the ordinary differential equation induced by classifier-free guidance (CFG) as a multirate system, decoupling the integration of the noise estimate from that of the guidance term; this is presented as the first application of multirate integration to diffusion sampling. A theoretical error analysis shows that the guidance branch is more robust to approximation and therefore contains substantial redundancy. Building on this, we design an error-bound-aware adaptive timestep sampler, a guidance-scale scheduler, and dual-grid integration (a fine grid for the noise estimate, a coarse grid for the guidance term). Results: Experiments show up to a 30% reduction in the number of function evaluations (NFE) with near-lossless generation quality (ΔImageReward ≤ 0.032), outperforming existing training-free acceleration methods and moving toward real-time high-fidelity image synthesis.

📝 Abstract
In this paper, we propose Tortoise and Hare Guidance (THG), a training-free strategy that accelerates diffusion sampling while maintaining high-fidelity generation. We demonstrate that the noise estimate and the additional guidance term exhibit markedly different sensitivity to numerical error by reformulating the classifier-free guidance (CFG) ODE as a multirate system of ODEs. Our error-bound analysis shows that the additional guidance branch is more robust to approximation, revealing substantial redundancy that conventional solvers fail to exploit. Building on this insight, THG significantly reduces the computation of the additional guidance: the noise estimate is integrated with the tortoise equation on the original, fine-grained timestep grid, while the additional guidance is integrated with the hare equation only on a coarse grid. We also introduce (i) an error-bound-aware timestep sampler that adaptively selects step sizes and (ii) a guidance-scale scheduler that stabilizes large extrapolation spans. THG reduces the number of function evaluations (NFE) by up to 30% with virtually no loss in generation fidelity ($\Delta$ImageReward $\leq$ 0.032) and outperforms state-of-the-art CFG-based training-free accelerators under identical computation budgets. Our findings highlight the potential of multirate formulations for diffusion solvers, paving the way for real-time high-quality image synthesis without any model retraining. The source code is available at https://github.com/yhlee-add/THG.
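
To make the fine-grid/coarse-grid idea concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes the guided estimate is split as `eps_cond + (guidance_scale - 1) * (eps_cond - eps_uncond)`, uses a scalar state with a simplified ODE `dx/dt = eps_hat`, and refreshes the guidance difference only every `coarse_every` fine steps. The function `multirate_cfg_euler`, the parameter `coarse_every`, and the two toy noise predictors are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence, Tuple

Eps = Callable[[float, float], float]  # (x, t) -> predicted noise; toy scalar state


def multirate_cfg_euler(
    x: float,
    timesteps: Sequence[float],
    eps_cond: Eps,
    eps_uncond: Eps,
    guidance_scale: float,
    coarse_every: int,
) -> Tuple[float, int]:
    """Toy Euler sampler returning (final state, number of function evaluations).

    Assumption: the guided estimate is split as
        eps_cond + (guidance_scale - 1) * (eps_cond - eps_uncond),
    and the difference term is refreshed only every `coarse_every` fine steps.
    """
    if coarse_every < 1:
        raise ValueError("coarse_every must be >= 1")
    nfe = 0
    delta = 0.0  # last computed guidance difference, reused between refreshes
    for i in range(len(timesteps) - 1):
        t, t_next = timesteps[i], timesteps[i + 1]
        e_c = eps_cond(x, t)  # evaluated on every fine-grid step
        nfe += 1
        if i % coarse_every == 0:  # coarse grid: refresh the guidance term
            delta = e_c - eps_uncond(x, t)
            nfe += 1
        eps_hat = e_c + (guidance_scale - 1.0) * delta
        x = x + (t_next - t) * eps_hat  # explicit Euler step on dx/dt = eps_hat
    return x, nfe


# Toy stand-ins for the two network passes (hypothetical, for illustration only).
def toy_eps_cond(x: float, t: float) -> float:
    return x - 1.0


def toy_eps_uncond(x: float, t: float) -> float:
    return x


grid = [1.0 - i / 10 for i in range(11)]  # 10 fine steps from t=1 to t=0
x_full, nfe_full = multirate_cfg_euler(
    2.0, grid, toy_eps_cond, toy_eps_uncond, 3.0, coarse_every=1
)
x_fast, nfe_fast = multirate_cfg_euler(
    2.0, grid, toy_eps_cond, toy_eps_uncond, 3.0, coarse_every=3
)
print(nfe_full, nfe_fast)  # 20 14
```

In this toy setup the guidance difference is constant, so reusing it between refreshes costs essentially nothing in accuracy. In a real model the difference drifts over time, which is the situation the error-bound-aware timestep sampler and the guidance-scale scheduler described above are meant to handle.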
Problem

Research questions and friction points this paper is trying to address.

Accelerating diffusion model inference while maintaining high-fidelity generation quality
Reducing computational redundancy in classifier-free guidance through multirate integration
Enabling real-time high-quality image synthesis without requiring model retraining
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multirate integration decouples the noise estimate from the additional guidance term
Coarse-grid integration cuts the cost of computing the additional guidance term
Error-bound-aware timestep sampler adaptively selects step sizes
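
One way to picture how an error-aware rule could pick the coarse grid is sketched below. This is a hypothetical greedy selector, not the paper's sampler: it assumes a per-step error proxy is already available (`error_proxy[i]` is the estimated cost of reusing a stale guidance term at fine step `i`) and refreshes the guidance term whenever the accumulated proxy exceeds a tolerance. The names `select_coarse_steps`, `error_proxy`, and `tolerance` are illustrative assumptions.

```python
from typing import List, Sequence


def select_coarse_steps(error_proxy: Sequence[float], tolerance: float) -> List[int]:
    """Return the fine-grid indices at which the guidance term is recomputed.

    Greedy rule (an assumption for illustration): keep reusing the last
    guidance value until the error proxy accumulated since the last refresh
    exceeds `tolerance`, then refresh at that step and reset the accumulator.
    """
    if not error_proxy:
        return []
    selected = [0]  # the guidance term must be computed at the first step
    accumulated = 0.0
    for i in range(1, len(error_proxy)):
        accumulated += error_proxy[i]
        if accumulated > tolerance:
            selected.append(i)
            accumulated = 0.0
    return selected


def total_nfe(num_fine_steps: int, coarse_steps: Sequence[int]) -> int:
    """One evaluation per fine step plus one extra per guidance refresh."""
    return num_fine_steps + len(coarse_steps)


# A proxy that is large early and small later yields denser refreshes early on.
proxy = [0.5, 0.5, 0.1, 0.1, 0.1, 0.1]
steps = select_coarse_steps(proxy, tolerance=0.25)
print(steps, total_nfe(len(proxy), steps))  # [0, 1, 4] 9
```

The design choice here is that refreshes concentrate where the proxy says stale guidance is most harmful, so the evaluation budget is spent unevenly rather than on a fixed stride.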