🤖 AI Summary
Diffusion models face significant deployment challenges due to slow inference and high memory and computational costs. Existing post-training quantization methods typically apply uniform calibration weights across all timesteps, neglecting the distinct gradient and activation distributions at each timestep, which leads to performance degradation. To address this, this work proposes a timestep-aware post-training quantization method that dynamically learns optimal calibration weights for each timestep, aligning the quantized model's gradient directions and mitigating gradient conflicts. By introducing a timestep-adaptive weighting mechanism, the proposed approach substantially outperforms current quantization schemes on CIFAR-10, LSUN-Bedrooms, and ImageNet, achieving both higher generation quality and improved inference efficiency.
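The "gradient conflict" the summary refers to can be made concrete: when the calibration gradients at different timesteps point in opposing directions, a single uniform update cannot satisfy both. A minimal sketch (the gradient vectors below are illustrative, not from the paper) quantifies this with cosine similarity:

```python
import numpy as np

def grad_conflict(g1, g2):
    """Cosine similarity between two per-timestep gradient vectors.
    Values near -1 indicate strongly conflicting update directions;
    values near +1 indicate aligned ones."""
    return float(np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2)))

# Hypothetical calibration-loss gradients at an early and a late timestep.
g_early = np.array([1.0, 0.5, -0.2])
g_late = np.array([-0.8, 0.1, 0.4])
print(grad_conflict(g_early, g_late))  # negative → conflicting directions
```

Under uniform weighting, such conflicting gradients partially cancel, which is the degradation the timestep-adaptive weights are meant to avoid.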
📝 Abstract
Diffusion models have shown remarkable performance in image synthesis by progressively estimating a smooth transition from Gaussian noise to a real image. Unfortunately, their practical deployment is limited by slow inference speed, high memory usage, and the computational demands of the noise estimation process. Post-training quantization (PTQ) emerges as a promising solution to accelerate sampling and reduce memory overhead for diffusion models. Existing PTQ methods for diffusion models typically apply uniform weights to calibration samples across timesteps, which is sub-optimal since data at different timesteps may contribute differently to the diffusion process. Moreover, because activation distributions and gradients vary across timesteps, each timestep requires a different gradient direction for optimal quantization, and treating all timesteps equally can lead to conflicting gradients that degrade performance. In this paper, we propose a novel PTQ method that addresses these challenges by assigning appropriate weights to calibration samples. Specifically, our approach learns to assign optimal weights to calibration samples so as to align the quantized model's gradients across timesteps, facilitating the quantization process. Extensive experiments on CIFAR-10, LSUN-Bedrooms, and ImageNet demonstrate the superiority of our method compared to other PTQ methods for diffusion models.
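The abstract's core idea, weighting calibration samples per timestep so that the aggregate gradient aligns with the per-timestep gradients, can be sketched as follows. This is a simplified illustration under assumed shapes (the softmax parameterization, the random gradient matrix `G`, and the alignment score are our assumptions, not the paper's exact formulation):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: weights are positive and sum to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-timestep gradients of the quantized model's
# calibration loss: T timesteps, D parameters.
rng = np.random.default_rng(0)
T, D = 5, 8
G = rng.normal(size=(T, D))

theta = np.zeros(T)        # unconstrained logits to be learned
w = softmax(theta)         # per-timestep calibration weights
g_agg = w @ G              # weighted aggregate gradient

# One possible alignment objective: mean cosine similarity between the
# aggregate gradient and each timestep's gradient. Learning theta to
# increase this score reduces cross-timestep gradient conflicts.
cos = G @ g_agg / (np.linalg.norm(G, axis=1) * np.linalg.norm(g_agg))
print(cos.mean())
```

With uniform weights (`theta = 0`), poorly aligned timesteps drag the score down; optimizing `theta` instead lets the calibration emphasize timesteps whose gradients cooperate.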