TC-Padé: Trajectory-Consistent Padé Approximation for Diffusion Acceleration

📅 2026-03-03
🤖 AI Summary
This work addresses the drop in generation quality when feature caching is applied to low-step diffusion sampling (20–30 steps). In this regime, accumulated feature prediction errors and trajectory drift make it hard to balance efficiency and fidelity. The authors propose a trajectory-consistent feature prediction framework that replaces conventional Taylor expansions with Padé rational function approximation. The method combines adaptive coefficient modulation, driven by historical cached residuals, with stage-aware prediction for the early, mid, and late phases of denoising, so that it models both asymptotic and transitional behavior. Evaluated on DiT-XL/2, FLUX.1-dev, and Wan2.1, it improves trajectory consistency and stability at low step counts, reaching 2.88× acceleration on FLUX.1-dev and 1.72× on Wan2.1 while outperforming existing feature caching techniques on FID, CLIP score, Aesthetic score, and VBench-2.0.

📝 Abstract
Despite achieving state-of-the-art generation quality, diffusion models are hindered by the substantial computational burden of their iterative sampling process. While feature caching techniques achieve effective acceleration at higher step counts (e.g., 50 steps), they exhibit critical limitations in the practical low-step regime of 20-30 steps. As the interval between steps increases, polynomial-based extrapolators like TaylorSeer suffer from error accumulation and trajectory drift. Meanwhile, conventional caching strategies often overlook the distinct dynamical properties of different denoising phases. To address these challenges, we propose Trajectory-Consistent Padé approximation, a feature prediction framework grounded in Padé approximation. By modeling feature evolution through rational functions, our approach captures asymptotic and transitional behaviors more accurately than Taylor-based methods. To enable stable and trajectory-consistent sampling under reduced step counts, TC-Padé incorporates (1) adaptive coefficient modulation that leverages historical cached residuals to detect subtle trajectory transitions, and (2) step-aware prediction strategies tailored to the distinct dynamics of early, mid, and late sampling stages. Extensive experiments on DiT-XL/2, FLUX.1-dev, and Wan2.1 across both image and video generation demonstrate the effectiveness of TC-Padé. For instance, TC-Padé achieves 2.88x acceleration on FLUX.1-dev and 1.72x on Wan2.1 while maintaining high quality across FID, CLIP, Aesthetic, and VBench-2.0 metrics, substantially outperforming existing feature caching methods.
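The contrast between polynomial and rational extrapolation can be illustrated with a scalar toy example. This is a minimal sketch, not the paper's implementation: the function names `taylor2` and `pade11` are hypothetical, the test signal `1 - exp(-t)` is an assumed stand-in for a saturating feature trajectory, and neither the adaptive coefficient modulation nor the step-aware strategies of TC-Padé are modeled. It only shows that a [1/1] Padé approximant built from the same three local coefficients as a second-order Taylor expansion can stay closer to a saturating curve when the step interval grows.

```python
import math


def taylor2(c0, c1, c2, h):
    """Second-order Taylor extrapolation: c0 + c1*h + c2*h^2."""
    return c0 + c1 * h + c2 * h * h


def pade11(c0, c1, c2, h):
    """[1/1] Pade approximant matching the same three Taylor coefficients.

    R(h) = (a0 + a1*h) / (1 + b1*h), where
    b1 = -c2 / c1, a0 = c0, a1 = c1 + c0 * b1.
    Falls back to the Taylor form when the rational form is undefined
    (c1 == 0) or when h lands on the pole of the denominator.
    """
    if c1 == 0:
        return taylor2(c0, c1, c2, h)
    b1 = -c2 / c1
    denom = 1.0 + b1 * h
    if abs(denom) < 1e-12:
        return taylor2(c0, c1, c2, h)
    a0 = c0
    a1 = c1 + c0 * b1
    return (a0 + a1 * h) / denom


def true_signal(t):
    """Assumed toy trajectory that saturates toward 1 as t grows."""
    return 1.0 - math.exp(-t)


# Taylor coefficients of 1 - exp(-t) around t = 0: 0 + 1*t - 0.5*t^2 + ...
C0, C1, C2 = 0.0, 1.0, -0.5

if __name__ == "__main__":
    for h in (0.25, 1.0, 2.0, 3.0):
        exact = true_signal(h)
        err_taylor = abs(taylor2(C0, C1, C2, h) - exact)
        err_pade = abs(pade11(C0, C1, C2, h) - exact)
        print(f"h={h:4.2f}  taylor_err={err_taylor:.4f}  pade_err={err_pade:.4f}")
```

In this toy setting both approximations agree for small `h`, while for larger `h` the quadratic term drives the Taylor estimate away from the curve and the rational form remains bounded. A rational approximant can also have a pole, which is why the sketch guards the denominator.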
Problem

Research questions and friction points this paper is trying to address.

diffusion acceleration
trajectory drift
feature caching
low-step sampling
denoising dynamics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Padé approximation
diffusion acceleration
trajectory consistency
feature caching
step-aware prediction
Benlei Cui
Alibaba Group
Shaoxuan He
Zhejiang University
Bukun Huang
Zhejiang Gongshang University
Zhizeng Ye
Zhejiang Gongshang University
Yunyun Sun
Alibaba Group
Longtao Huang
Alibaba Group
Knowledge Graph, Service Computing, Data Mining
Hui Xue
Alibaba Group
Yang Yang
Alibaba Group
Jingqun Tang
ByteDance Inc.
Computer Vision, Document Intelligence, MLLM, Multimodal Generative Models
Zhou Zhao
Zhejiang University
Machine Learning, Data Mining, Multimedia Computing
Haiwen Hong
Alibaba Group