Spectral Progressive Diffusion for Efficient Image and Video Generation

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a computational redundancy in conventional diffusion models for image and video generation: early denoising steps run at full resolution even though the high-frequency components they process are still dominated by noise. Building on the observation that diffusion models implicitly generate content autoregressively in the frequency domain, the authors propose Spectral Progressive Diffusion, a framework that progressively increases resolution along the denoising trajectory using a spectral noise expansion mechanism and a resolution schedule derived from the model's power spectrum. High-resolution computation is thereby deferred to the later steps, where high-frequency detail actually emerges. The framework can accelerate existing pretrained diffusion models without retraining, and a fine-tuning recipe further improves efficiency and quality. The authors report significant inference speedups on pretrained image and video generation models while preserving visual quality.
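To make the idea of a power-spectrum-derived resolution schedule concrete, here is a minimal sketch. It is not the authors' derivation, which this summary does not spell out. It assumes a power-law signal spectrum `amplitude / f**alpha` and white noise of power `sigma**2`, and picks the smallest resolution whose Nyquist frequency covers the band where signal exceeds noise. The function names and the parameters `amplitude`, `alpha`, and `multiple` are hypothetical.

```python
import numpy as np

def cutoff_frequency(sigma, amplitude=1.0, alpha=2.0):
    """Frequency (cycles per image) where an ASSUMED power-law signal
    spectrum amplitude / f**alpha equals the white-noise power sigma**2.
    sigma must be positive."""
    return (amplitude / sigma**2) ** (1.0 / alpha)

def resolution_schedule(sigmas, full_res, amplitude=1.0, alpha=2.0, multiple=8):
    """Hypothetical rule: per denoising step, choose the smallest resolution
    (a multiple of `multiple`, capped at full_res) whose Nyquist frequency
    covers the signal-dominated band. `sigmas` runs from high to low noise."""
    schedule = []
    for sigma in sigmas:
        f_c = cutoff_frequency(sigma, amplitude, alpha)
        needed = 2.0 * f_c                                  # Nyquist: res >= 2 * f_c
        res = int(np.ceil(needed / multiple) * multiple)    # round up to a multiple
        schedule.append(min(max(res, multiple), full_res))  # clamp to valid range
    # Resolution should never shrink as denoising proceeds.
    return np.maximum.accumulate(schedule).tolist()
```

Calling `resolution_schedule([10.0, 1.0, 0.1, 0.01], full_res=256, amplitude=100.0)` yields a non-decreasing list that starts at the smallest allowed resolution and ends at 256. Under these assumptions, most early steps run far below full resolution, which is where the savings would come from.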

📝 Abstract

Diffusion models have been shown to implicitly generate visual content autoregressively in the frequency domain, where low-frequency components are generated earlier in the denoising process while high-frequency details emerge only in later timesteps. This structure offers a natural opportunity for efficient generation, as high-resolution computation on noise-dominated frequencies is largely redundant. We propose Spectral Progressive Diffusion, a general framework that progressively grows resolution along the denoising trajectory of pretrained diffusion models. To this end, we develop a spectral noise expansion mechanism and derive an optimal resolution schedule from the model's power spectrum. Our framework supports training-free acceleration and a novel fine-tuning recipe that further improves efficiency and quality. We demonstrate significant speedups on state-of-the-art pretrained image and video generation models while preserving visual quality.
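The abstract does not describe how the spectral noise expansion works, so the following is only an illustrative sketch of one plausible reading: when the resolution grows, keep the existing low-resolution spectrum as the low-frequency band and fill the newly available high-frequency bands with Gaussian noise. The function `spectral_expand` and its parameters are hypothetical, and the sketch handles a single square 2D array with even side lengths.

```python
import numpy as np

def spectral_expand(x_low, new_size, sigma, rng=None):
    """Illustrative only (not the paper's mechanism): upsample a square 2D
    array to new_size x new_size by embedding its spectrum as the
    low-frequency band of a white-noise spectrum with std `sigma`."""
    rng = np.random.default_rng() if rng is None else rng
    n = x_low.shape[0]
    assert x_low.shape == (n, n), "expects a square 2D array"
    assert n % 2 == 0 and new_size % 2 == 0 and new_size >= n

    # NumPy's FFT is unnormalized, so coefficients scale with the sample
    # count; rescale to keep the signal amplitude unchanged after upsampling.
    scale = (new_size / n) ** 2
    low_spec = np.fft.fftshift(np.fft.fft2(x_low)) * scale

    # Spectrum of fresh white noise at the target resolution.
    noise = rng.normal(0.0, sigma, size=(new_size, new_size))
    out_spec = np.fft.fftshift(np.fft.fft2(noise))

    # Overwrite the centered low-frequency block with the existing content;
    # the surrounding high-frequency bands remain pure noise.
    start = (new_size - n) // 2
    out_spec[start:start + n, start:start + n] = low_spec

    # Drop the small imaginary residue from the unpaired Nyquist band.
    return np.fft.ifft2(np.fft.ifftshift(out_spec)).real
```

A design point worth noting: because the low-frequency block is copied verbatim, the mean and coarse structure of the low-resolution state carry over, while the added bands stay noise-dominated until later denoising steps resolve them. A real latent tensor would need the same operation applied per channel (and per frame for video).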

Problem

Research questions and friction points this paper is trying to address.

diffusion models

efficient generation

frequency domain

resolution scheduling

spectral redundancy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral Progressive Diffusion

frequency-domain generation

resolution scheduling