Sharpness-Aware Data Generation for Zero-shot Quantization

📅 2025-10-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Zero-shot quantization aims to directly compress pre-trained full-precision models into low-bit representations without access to original training data. Existing approaches predominantly rely on synthetic data generation but overlook parameter-space sharpness—a critical determinant of post-quantization generalization. This paper is the first to explicitly incorporate sharpness minimization into synthetic data generation for zero-shot quantization. We propose a gradient-matching-based sharpness regularization mechanism, wherein gradients computed on neighborhoods of generated samples approximate those on the unseen validation set—enabling data-free generalization guidance. Our method jointly optimizes reconstruction loss and sharpness-aware quantization within a unified zero-shot framework. Extensive experiments on CIFAR-100 and ImageNet demonstrate substantial improvements over state-of-the-art methods, particularly at ultra-low bit-widths (2–4 bits), yielding higher accuracy and enhanced robustness of quantized models.

📝 Abstract
Zero-shot quantization aims to learn a quantized model from a pre-trained full-precision model with no access to the original real training data. The common idea in zero-shot quantization approaches is to generate synthetic data for quantizing the full-precision model. While it is well known that deep neural networks with low sharpness have better generalization ability, none of the previous zero-shot quantization works considers the sharpness of the quantized model as a criterion for generating training data. This paper introduces a novel methodology that takes quantized-model sharpness into account during synthetic data generation to enhance generalization. Specifically, we first demonstrate that, under certain assumptions, sharpness minimization can be attained by maximizing the gradient matching between the reconstruction-loss gradients computed on synthetic data and on real validation data. We then circumvent the absence of a real validation set by approximating this gradient matching with the gradient matching between each generated sample and its neighbors. Experimental evaluations on the CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed method over state-of-the-art techniques in low-bit quantization settings.
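The neighbor-based approximation described above can be illustrated with a minimal sketch: measure how well the loss gradient at a synthetic sample aligns with the gradients at randomly perturbed neighbors of that sample, and penalize misalignment. The toy quadratic loss, function names, and hyperparameters below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def loss_grad(w, x):
    # Toy stand-in for a reconstruction-style loss L(w, x) = 0.5 * ||w * x||^2;
    # returns its gradient with respect to w (element-wise model for illustration).
    return (w * x) * x

def sharpness_proxy(w, x, n_neighbors=8, radius=0.05, seed=0):
    # Hypothetical data-free sharpness proxy: 1 minus the mean cosine
    # similarity between the loss gradient at a synthetic sample x and
    # the gradients at perturbed neighbors of x. Low values mean the
    # gradients match well in the neighborhood (a flatter region).
    rng = np.random.default_rng(seed)
    g = loss_grad(w, x)
    sims = []
    for _ in range(n_neighbors):
        x_nb = x + radius * rng.standard_normal(x.shape)  # sample a neighbor
        g_nb = loss_grad(w, x_nb)
        cos = g @ g_nb / (np.linalg.norm(g) * np.linalg.norm(g_nb) + 1e-12)
        sims.append(cos)
    return 1.0 - float(np.mean(sims))
```

In a full pipeline this term would be added, with some weighting, to the data-generation objective, so the generator is steered toward samples whose neighborhoods have well-aligned gradients.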
Problem

Research questions and friction points this paper is trying to address.

Generating synthetic data for zero-shot quantization without real training data
Incorporating quantized-model sharpness as a criterion for better generalization in quantization
Approximating gradient matching without a real validation set through neighbor comparisons
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates synthetic data considering quantized model sharpness
Maximizes gradient matching for sharpness minimization
Approximates gradient matching using generated sample neighbors
Dung Hoang-Anh
Department of Data Science and AI, Monash University, Melbourne, Australia
Cuong Pham, Trung Le
Department of Data Science and AI, Monash University, Melbourne, Australia
Jianfei Cai
Professor of Data Science & AI, Monash University
Visual computing, multimedia, computer vision, multimedia networking
Thanh-Toan Do
Senior Lecturer, Monash University
Computer Vision, Robotic Vision, Machine Learning