AI Summary
Diffusion models yield high-fidelity samples but sample slowly because each sample requires a large number of function evaluations (NFE); existing timestep scheduling methods struggle to jointly achieve effectiveness, adaptivity, practical robustness, and computational efficiency. This paper proposes HSO (Hierarchical-Schedule-Optimizer), a training-free bi-level schedule optimization framework. HSO alternates between an upper-level global search for an initialization strategy and a lower-level local optimization that refines the schedule. The local optimization is guided by the Midpoint Error Proxy (MEP), a solver-agnostic and numerically stable objective, while the Spacing-Penalized Fitness (SPF) function provides practical robustness by penalizing pathologically close timesteps. HSO requires no model retraining and targets the extremely low-NFE regime. With an NFE of 5 on LAION-Aesthetics with Stable Diffusion v2.1, HSO achieves an FID of 11.94 at a one-time optimization cost of under 8 seconds, which the authors report as a new state of the art for training-free sampling.
Abstract
Diffusion probabilistic models have set a new standard for generative fidelity but are hindered by a slow iterative sampling process. A powerful training-free strategy to accelerate this process is Schedule Optimization, which aims to find an optimal distribution of timesteps for a fixed and small Number of Function Evaluations (NFE) to maximize sample quality. To this end, a successful schedule optimization method must adhere to four core principles: effectiveness, adaptivity, practical robustness, and computational efficiency. However, existing paradigms struggle to satisfy these principles simultaneously, motivating the need for a more advanced solution. To overcome these limitations, we propose the Hierarchical-Schedule-Optimizer (HSO), a novel and efficient bi-level optimization framework. HSO reframes the search for a globally optimal schedule into a more tractable problem by iteratively alternating between two synergistic levels: an upper-level global search for an optimal initialization strategy and a lower-level local optimization for schedule refinement. This process is guided by two key innovations: the Midpoint Error Proxy (MEP), a solver-agnostic and numerically stable objective for effective local optimization, and the Spacing-Penalized Fitness (SPF) function, which ensures practical robustness by penalizing pathologically close timesteps. Extensive experiments show that HSO sets a new state-of-the-art for training-free sampling in the extremely low-NFE regime. For instance, with an NFE of just 5, HSO achieves a remarkable FID of 11.94 on LAION-Aesthetics with Stable Diffusion v2.1. Crucially, this level of performance is attained not through costly retraining, but with a one-time optimization cost of less than 8 seconds, presenting a highly practical and efficient paradigm for diffusion model acceleration.
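The bi-level structure described in the abstract can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the abstract does not give the MEP or SPF formulas, so `toy_error_proxy` is a placeholder objective (not the paper's MEP), `spacing_penalty` is one plausible way to penalize timesteps that sit too close together (not the paper's SPF), and `power_schedule`, `refine_locally`, and `optimize_schedule` are hypothetical helper names. The upper level is simplified to a grid search over an initialization parameter `rho`, rather than the iterative alternation the authors describe.

```python
import math
from typing import Callable, List, Sequence, Tuple


def power_schedule(n_steps: int, rho: float,
                   t_max: float = 1.0, t_min: float = 1e-3) -> List[float]:
    """Hypothetical initialization family: n_steps + 1 decreasing timesteps."""
    return [t_min + (t_max - t_min) * (1 - i / n_steps) ** rho
            for i in range(n_steps + 1)]


def toy_error_proxy(ts: Sequence[float]) -> float:
    """Placeholder objective (NOT the paper's MEP): a cubic step-size error
    weighted by 1 / midpoint of each interval."""
    return sum((a - b) ** 3 / (0.5 * (a + b)) for a, b in zip(ts, ts[1:]))


def spacing_penalty(ts: Sequence[float], min_gap: float) -> float:
    """Assumed penalty (NOT the paper's SPF): quadratic cost for every gap
    between consecutive timesteps that is smaller than min_gap."""
    return sum(max(0.0, min_gap - (a - b)) ** 2 for a, b in zip(ts, ts[1:]))


def refine_locally(ts: Sequence[float], proxy: Callable[[Sequence[float]], float],
                   n_sweeps: int = 20, step: float = 0.02) -> List[float]:
    """Lower level: coordinate-wise search over the interior timesteps.
    Endpoints stay fixed and the schedule stays strictly decreasing."""
    ts = list(ts)
    best = proxy(ts)
    for _ in range(n_sweeps):
        for i in range(1, len(ts) - 1):
            for delta in (step, -step):
                cand = ts[:]
                cand[i] += delta
                if not (cand[i - 1] > cand[i] > cand[i + 1]):
                    continue  # reject moves that break the ordering
                val = proxy(cand)
                if val < best:
                    ts, best = cand, val
        step *= 0.7  # shrink the perturbation after each sweep
    return ts


def optimize_schedule(n_steps: int,
                      rhos: Sequence[float] = (0.5, 1.0, 2.0, 3.0, 5.0, 7.0),
                      min_gap: float = 0.01,
                      penalty_weight: float = 10.0) -> Tuple[List[float], float]:
    """Upper level: try each initialization, refine it, and keep the schedule
    with the lowest spacing-penalized fitness."""
    best_ts: List[float] = []
    best_fit = math.inf
    for rho in rhos:
        ts = refine_locally(power_schedule(n_steps, rho), toy_error_proxy)
        fit = toy_error_proxy(ts) + penalty_weight * spacing_penalty(ts, min_gap)
        if fit < best_fit:
            best_ts, best_fit = ts, fit
    return best_ts, best_fit


if __name__ == "__main__":
    schedule, fitness = optimize_schedule(n_steps=5)
    print([round(t, 4) for t in schedule], fitness)
```

In this sketch the penalty is applied only when candidates are compared at the upper level, so a refined schedule whose timesteps nearly collapse loses to one with healthier spacing. In a real setting the placeholder proxy would be replaced by an objective computed from the diffusion model and solver.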