AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models

📅 2025-10-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
In post-training quantization of diffusion models, quantization errors accumulate across denoising steps, severely degrading generation fidelity. To address this, we propose a sampling-aware joint quantization framework that explicitly models cross-step error propagation. Our method introduces a multi-step output-alignment objective for joint calibration across denoising iterations, together with an O(1)-memory gradient optimization strategy that avoids the O(n) storage cost a naive multi-step objective would incur (n being the number of denoising steps). Quantization is performed entirely post-training, guided by the outputs of the full-precision model. Extensive experiments across diverse diffusion architectures (e.g., DDPM, DDIM, LDM) and benchmarks (e.g., CIFAR-10, CelebA-HQ, LSUN) show that the approach significantly outperforms stepwise independent quantization, achieving strong compression (e.g., 4-bit weights and activations) while preserving high-fidelity image generation.
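The error-accumulation effect the summary describes can be seen in a toy NumPy sketch. This is an illustration, not AccuQuant's implementation: the linear `step` "denoiser" and the uniform `quantize` function below are hypothetical stand-ins. It compares the error each quantized step introduces in isolation (what per-step calibration sees) against the error of the full quantized trajectory (what a multi-step objective sees):

```python
import numpy as np

def quantize(w, bits=4):
    # Toy uniform symmetric weight quantizer (stand-in for PTQ).
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)) * 0.1   # full-precision "denoiser" weights
w_q = quantize(w)                   # quantized weights

def step(x, weights):
    # One toy "denoising" step: a near-identity linear update.
    return x + weights @ x * 0.1

x_fp = x_q = rng.normal(size=8)
per_step_err = []   # error of one quantized step from the FP trajectory
accum_err = []      # error of the full quantized trajectory
for t in range(10):
    x_fp_next = step(x_fp, w)
    per_step_err.append(np.linalg.norm(step(x_fp, w_q) - x_fp_next))
    x_q = step(x_q, w_q)
    x_fp = x_fp_next
    accum_err.append(np.linalg.norm(x_q - x_fp))

# The trajectory error compounds across steps, while the isolated
# per-step error stays roughly constant -- the gap a sampling-aware
# (multi-step) calibration objective targets.
print(per_step_err[-1], accum_err[-1])
```

Per-step calibration only ever optimizes the first quantity; the multi-step alignment objective described above optimizes the second.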

📝 Abstract
We present in this paper a novel post-training quantization (PTQ) method, dubbed AccuQuant, for diffusion models. We show analytically and empirically that quantization errors for diffusion models are accumulated over denoising steps in a sampling process. To alleviate the error accumulation problem, AccuQuant minimizes the discrepancies between outputs of a full-precision diffusion model and its quantized version within a couple of denoising steps. That is, it simulates multiple denoising steps of a diffusion sampling process explicitly for quantization, accounting for the accumulated errors over multiple denoising steps, which is in contrast to previous approaches to imitating a training process of diffusion models, namely, minimizing the discrepancies independently for each step. We also present an efficient implementation technique for AccuQuant, together with a novel objective, which reduces the memory complexity significantly from $\mathcal{O}(n)$ to $\mathcal{O}(1)$, where $n$ is the number of denoising steps. We demonstrate the efficacy and efficiency of AccuQuant across various tasks and diffusion models on standard benchmarks.
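The $\mathcal{O}(n)$-to-$\mathcal{O}(1)$ memory reduction can be illustrated generically: a multi-step objective naively keeps every step's tensors alive until the loss and its gradient are formed at the end, whereas a running accumulation folds each step's contribution into one fixed-size buffer and discards the rest. The NumPy sketch below is a simplified stand-in, not the paper's actual objective; the linear "quantized layer" `W` and the per-step targets are hypothetical. It computes the gradient of a summed per-step squared error both ways and checks they agree:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 8
W = rng.normal(size=(d, d))                      # shared "quantized" parameters
xs = [rng.normal(size=d) for _ in range(n)]      # per-step inputs
ys = [rng.normal(size=d) for _ in range(n)]      # full-precision targets

# O(n) memory: keep every step's residual alive, form the gradient at the end.
residuals = [W @ x - y for x, y in zip(xs, ys)]  # n vectors stored
grad_stored = sum(2.0 * np.outer(r, x) for r, x in zip(residuals, xs)) / n

# O(1) memory: fold each step's gradient contribution into one accumulator;
# only the current step's tensors are ever live.
grad_running = np.zeros_like(W)
for x, y in zip(xs, ys):
    r = W @ x - y
    grad_running += 2.0 * np.outer(r, x) / n

print(np.allclose(grad_stored, grad_running))
```

The two gradients are identical; only the peak memory differs, which is what makes joint calibration over many denoising steps tractable.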
Problem

Research questions and friction points this paper is trying to address.

Quantization errors accumulate over denoising steps, degrading generation quality
Calibrating each denoising step independently ignores cross-step error propagation
Simulating multiple denoising steps for calibration naively requires O(n) memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulates multiple denoising steps for quantization
Minimizes discrepancies between full-precision and quantized models
Reduces memory complexity from O(n) to O(1)
Seunghoon Lee
School of Electrical and Electronic Engineering, Yonsei University
Jeongwoo Choi
School of Electrical and Electronic Engineering, Yonsei University
Byunggwan Son
School of Electrical and Electronic Engineering, Yonsei University
Jaehyeon Moon
School of Electrical and Electronic Engineering, Yonsei University
Jeimin Jeon
School of Electrical and Electronic Engineering, Yonsei University
Bumsub Ham
Yonsei University
Computer vision · Image processing