🤖 AI Summary
This work addresses the challenge of generating high-quality images with text-to-image diffusion models under severely limited sampling steps, a setting in which existing training-free acceleration methods lack systematic comparison and compatibility analysis. By leveraging the Frenet-Serret formulas from differential geometry, this study is the first to reveal the geometric properties of diffusion sampling trajectories, identifying the sampling time schedule as a critical factor governing generation quality. Building on this insight, the authors propose the constant total rotation schedule (TORS), a training-free scheduling strategy that achieves high-fidelity image synthesis in as few as 10 sampling steps on both Flux.1-Dev and Stable Diffusion 3.5. TORS also demonstrates strong generalization across unseen architectures, hyperparameters, and downstream tasks.
📝 Abstract
Text-to-image diffusion models have achieved unprecedented success but still struggle to produce high-quality results under limited sampling budgets. Existing training-free sampling acceleration methods are typically developed in isolation, leaving their overall performance and mutual compatibility unexplored. In this paper, we bridge this gap by systematically elucidating the design space, and our comprehensive experiments identify the sampling time schedule as the most pivotal factor. Inspired by the geometric properties of diffusion models revealed through the Frenet-Serret formulas, we propose the constant total rotation schedule (TORS), a scheduling strategy that enforces uniform geometric variation along the sampling trajectory. TORS outperforms previous training-free acceleration methods and produces high-quality images with 10 sampling steps on Flux.1-Dev and Stable Diffusion 3.5. Extensive experiments underscore the adaptability of our method to unseen models, hyperparameters, and downstream applications.
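To make the scheduling idea concrete, the sketch below shows one way a constant-total-rotation schedule could be computed; it is a minimal illustration under stated assumptions, not the paper's implementation. It assumes access to approximate trajectory tangents (e.g., the model's velocity or noise predictions) on a dense reference time grid: cumulative rotation is estimated by summing angles between consecutive tangent directions, and the few-step timesteps are placed at uniform quantiles of that curve. The function names (`cumulative_rotation`, `tors_schedule`) and the discretization are illustrative assumptions.

```python
import numpy as np

def cumulative_rotation(tangents):
    """Accumulated rotation angle along a discretized trajectory.

    tangents: (T, D) array of approximate trajectory tangents, e.g. the
    denoiser's velocity/noise predictions at a dense reference time grid
    (an assumption of this sketch, not the paper's exact construction).
    """
    u = tangents / np.linalg.norm(tangents, axis=1, keepdims=True)
    # Angle between consecutive unit tangents; clip guards arccos domain.
    cos = np.clip(np.sum(u[:-1] * u[1:], axis=1), -1.0, 1.0)
    angles = np.arccos(cos)
    return np.concatenate([[0.0], np.cumsum(angles)])

def tors_schedule(dense_times, tangents, num_steps):
    """Place num_steps + 1 timesteps at uniform quantiles of total rotation.

    dense_times: (T,) reference time grid aligned with `tangents`.
    Returns timesteps between which the trajectory rotates by equal amounts.
    """
    rot = cumulative_rotation(tangents)
    targets = np.linspace(0.0, rot[-1], num_steps + 1)
    # rot is non-decreasing, so we can invert it by interpolation.
    return np.interp(targets, rot, dense_times)

# Hypothetical usage: a 10-step schedule from a 100-point reference
# trajectory (random tangents stand in for real model predictions).
dense_times = np.linspace(1.0, 0.0, 100)
tangents = np.random.default_rng(0).normal(size=(100, 16))
steps = tors_schedule(dense_times, tangents, num_steps=10)
```

The design intuition mirrors the abstract: rather than spacing timesteps uniformly in time or noise level, the schedule equalizes geometric variation, spending more steps where the trajectory bends sharply and fewer where it is nearly straight.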