InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models

📅 2025-11-28
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Diffusion models suffer from slow inference due to iterative sampling, and existing acceleration methods often compromise fidelity. To address this, we propose a training-free inference acceleration framework that leverages an empirically discovered property: feature relative invariance across timesteps and network layers. Our method introduces a deterministic trajectory extraction mechanism to construct binary cache matrices, enabling joint module-level and full-step-level caching. We further design a quantile-based change metric to dynamically identify cacheable regions and integrate resampling-based correction to preserve reconstruction accuracy. Evaluated on DiT and FLUX architectures, our approach achieves 2–3× end-to-end speedup with negligible degradation in generation quality, with no perceptible visual artifacts observed. This significantly enhances the practical deployability of diffusion models without retraining or architectural modification.

πŸ“ Abstract
Diffusion models deliver high-fidelity synthesis but remain slow due to iterative sampling. We empirically observe feature invariance in deterministic sampling, and present InvarDiff, a training-free acceleration method that exploits relative temporal invariance at both the timestep scale and the layer scale. From a few deterministic runs, we compute a per-timestep, per-layer, per-module binary cache plan matrix and use a re-sampling correction to avoid drift when consecutive caches occur. Using quantile-based change metrics, this matrix specifies which module at which step is reused rather than recomputed. The same invariance criterion is applied at the step scale to enable cross-timestep caching, deciding whether an entire step can reuse cached results. During inference, InvarDiff performs step-first and layer-wise caching guided by this matrix. When applied to DiT and FLUX, our approach reduces redundant compute while preserving fidelity. Experiments show that InvarDiff achieves $2$-$3\times$ end-to-end speed-ups with minimal impact on standard quality metrics. Qualitatively, we observe almost no degradation in visual quality compared with full computation.
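The cache plan described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function name, the shape of the recorded change statistics, and the rule that a step is fully cacheable only when all of its modules are cacheable are assumptions made for clarity.

```python
import numpy as np

def build_cache_plan(feature_changes, q=0.3):
    """Hypothetical sketch: derive a binary cache-plan matrix from
    per-timestep, per-module feature change magnitudes recorded during
    a few deterministic probe runs.

    feature_changes: array of shape (timesteps, modules), where entry
    [t, m] is the relative change of module m's output between steps
    t-1 and t (averaged over the probe runs).
    q: quantile of the change distribution below which a module's
    output is treated as invariant and therefore cacheable.
    Returns (plan, step_cacheable): plan[t, m] is True when module m's
    cached output is reused at step t; step_cacheable[t] is True when
    the entire step can reuse cached results.
    """
    threshold = np.quantile(feature_changes, q)
    plan = feature_changes <= threshold
    plan[0, :] = False  # the first step is always computed in full
    # Assumed rule: a whole step is cacheable only if every module is.
    step_cacheable = plan.all(axis=1)
    return plan, step_cacheable
```

The quantile threshold makes the cacheable fraction of compute directly tunable: raising `q` trades fidelity for speed.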
Problem

Research questions and friction points this paper is trying to address.

Iterative sampling makes diffusion-model inference slow
Existing acceleration methods often compromise generation fidelity
Retraining or architectural changes limit practical deployability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-scale invariance caching for deterministic sampling acceleration
Binary cache plan matrix with re-sampling correction to prevent drift
Step-first and layer-wise caching guided by quantile-based change metrics
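The step-first, layer-wise caching strategy listed above can be illustrated with a toy inference loop. This is a hedged sketch under simplifying assumptions: the modules are plain callables, a cached module output is reused verbatim, and the re-sampling correction from the paper is omitted; none of these names or signatures come from the actual codebase.

```python
def cached_inference(modules, x, plan, step_cacheable):
    """Hypothetical sketch of step-first, layer-wise caching.

    modules: list of callables (the network's blocks).
    plan: (timesteps x modules) nested list of booleans; plan[t][m]
    True means reuse module m's cached output at step t.
    step_cacheable: per-step booleans; True means the whole step
    reuses the previous step's output.
    Returns the per-step outputs.
    """
    module_cache = [None] * len(modules)
    outputs = []
    prev_step_out = None
    for t in range(len(plan)):
        # Step-first: skip the entire step when it is invariant.
        if step_cacheable[t] and prev_step_out is not None:
            outputs.append(prev_step_out)
            continue
        h = x if prev_step_out is None else prev_step_out
        for m, module in enumerate(modules):
            if plan[t][m] and module_cache[m] is not None:
                h = module_cache[m]   # layer-wise reuse of cached output
            else:
                h = module(h)         # recompute and refresh the cache
                module_cache[m] = h
        prev_step_out = h
        outputs.append(h)
    return outputs
```

Checking the step-level flag before iterating modules is what makes the scheme "step-first": a fully invariant step costs no module evaluations at all.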