HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching

πŸ“… 2025-08-23
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Diffusion models suffer from low inference efficiency due to iterative sampling; existing feature caching approaches rely on temporal extrapolation but struggle to accurately model complex feature dynamics, often compromising generation quality. To address this, we propose HiCacheβ€”a training-free inference acceleration framework. Leveraging the empirical observation that feature derivatives approximately follow a Gaussian distribution, HiCache employs theoretically optimal Hermite polynomial expansions to model feature evolution. A dual-scale mechanism is introduced to ensure both numerical stability and prediction accuracy. Furthermore, HiCache is specifically adapted to the Diffusion Transformer architecture for efficient feature prediction. Evaluated on FLUX.1-dev, HiCache achieves a 6.24Γ— speedup while surpassing baseline generation quality. Its effectiveness and generalizability are further validated across text-to-image synthesis, video generation, and super-resolution tasks.

πŸ“ Abstract
Diffusion models have achieved remarkable success in content generation but suffer from prohibitive computational costs due to iterative sampling. While recent feature caching methods accelerate inference through temporal extrapolation, they still incur severe quality loss because they fail to model the complex dynamics of feature evolution. To solve this problem, this paper presents HiCache, a training-free acceleration framework that fundamentally improves feature prediction by aligning mathematical tools with empirical properties. Our key insight is that feature derivative approximations in Diffusion Transformers exhibit multivariate Gaussian characteristics, motivating the use of Hermite polynomials, the theoretically optimal basis for Gaussian-correlated processes. We further introduce a dual-scaling mechanism that ensures numerical stability while preserving predictive accuracy. Extensive experiments demonstrate HiCache's superiority: it achieves a 6.24x speedup on FLUX.1-dev while exceeding baseline quality, and it maintains strong performance across text-to-image, video generation, and super-resolution tasks. Core implementation is provided in the appendix, with complete code to be released upon acceptance.
Problem

Research questions and friction points this paper is trying to address.

Accelerates diffusion models' slow iterative sampling process
Improves feature prediction accuracy beyond temporal extrapolation methods
Ensures numerical stability while maintaining generation quality across tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hermite polynomial-based feature caching for diffusion models
Dual-scaling mechanism ensures numerical stability
Training-free acceleration framework with Gaussian derivative approximation
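The idea behind the innovations above can be illustrated with a small sketch: fit a Hermite polynomial series to recently cached features and extrapolate the next step instead of recomputing it. This is a hypothetical illustration, not the paper's implementation; the function name `hermite_extrapolate` and the timestep normalization (a stand-in for the paper's dual-scaling mechanism, whose exact form is not given here) are assumptions.

```python
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite basis

def hermite_extrapolate(feat_history, t_history, t_next, order=2):
    """Predict the feature tensor at t_next by fitting a (probabilists')
    Hermite polynomial series to recent cached features.
    Hypothetical sketch of the caching idea, not the paper's exact method."""
    # Normalize timesteps before fitting to keep the basis well-conditioned
    # (a stand-in for the paper's dual-scaling mechanism).
    t = np.asarray(t_history, dtype=float)
    mu, sigma = t.mean(), t.std() + 1e-8
    x = (t - mu) / sigma
    x_next = (t_next - mu) / sigma

    feats = np.stack(feat_history)      # (n_steps, *feat_shape)
    flat = feats.reshape(len(t), -1)    # fit every channel independently
    coeffs = He.hermefit(x, flat, deg=min(order, len(t) - 1))
    pred = He.hermeval(x_next, coeffs)  # (n_channels,)
    return pred.reshape(feats.shape[1:])
```

For features that evolve smoothly in the timestep (e.g. exactly quadratically), a low-order fit over a few cached steps recovers the next value, which is what lets whole denoising steps be skipped.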
πŸ”Ž Similar Papers
No similar papers found.
Liang Feng
Shanghai Jiao Tong University
Shikang Zheng
Shanghai Jiao Tong University
Jiacheng Liu
Shanghai Jiao Tong University
Yuqi Lin
Zhejiang University
Computer Vision · Multimodal Foundation Model
Qinming Zhou
Shanghai Jiao Tong University
Peiliang Cai
Shanghai Jiao Tong University
Xinyu Wang
Shanghai Jiao Tong University
Junjie Chen
Shanghai Jiao Tong University
Chang Zou
Intern at EPIC Lab, Shanghai Jiao Tong University
Generative models · Images and Videos generation
Yue Ma
Bytedance
NLP · Dialogue System · LLM
Linfeng Zhang
DP Technology; AI for Science Institute
AI for Science · multi-scale modeling · molecular simulation · drug/materials design