SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

πŸ“… 2026-03-18
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses the challenges of zero-shot model quantization in scenarios where original training data are unavailable, particularly the issues of noisy synthetic data, predictions driven by non-target patterns, and misleading hard labels. To this end, the authors propose SynQ, a novel framework that uniquely integrates three mechanisms: low-pass filtering to suppress noise in synthetic data, class activation map alignment to preserve semantic consistency, and soft-label distillation applied exclusively to hard samples to avoid erroneous guidance. By systematically tackling these core problems, SynQ achieves state-of-the-art accuracy, significantly outperforming existing methods across multiple benchmarks.
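The first mechanism, low-pass filtering of synthetic samples, can be sketched with a simple frequency-domain filter. This is an illustrative NumPy example, not the paper's exact filter; the function name and the `cutoff` parameter are assumptions for demonstration.

```python
import numpy as np

def low_pass_filter(image, cutoff=0.25):
    """Suppress high-frequency noise in a synthetic image (H, W) by
    keeping only low-frequency components in the 2-D Fourier domain.

    `cutoff` is the fraction of the spectrum radius to keep; this is an
    illustrative choice, not the paper's exact setting.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    # Distance of each frequency bin from the spectrum center.
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    mask = dist <= cutoff * min(h, w) / 2
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

# A toy "synthetic sample": smooth gradient plus high-frequency noise.
rng = np.random.default_rng(0)
clean = np.linspace(0, 1, 32)[None, :].repeat(32, axis=0)
noisy = clean + 0.3 * rng.standard_normal((32, 32))
smoothed = low_pass_filter(noisy)
```

After filtering, the image is closer to the underlying clean signal than the noisy input, which is the effect SynQ exploits before fine-tuning on generated data.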

πŸ“ Abstract
How can we accurately quantize a pre-trained model without any data? Quantization algorithms are widely used for deploying neural networks on resource-constrained edge devices. Zero-shot Quantization (ZSQ) addresses the crucial and practical scenario where training data are inaccessible for privacy or security reasons. However, three significant challenges hinder the performance of existing ZSQ methods: 1) noise in the synthetic dataset, 2) predictions based on off-target patterns, and 3) misguidance by erroneous hard labels. In this paper, we propose SynQ (Synthesis-aware Fine-tuning for Zero-shot Quantization), a carefully designed ZSQ framework that overcomes these limitations. SynQ minimizes the noise in the generated samples by exploiting a low-pass filter. Then, SynQ trains the quantized model to improve accuracy by aligning its class activation map with that of the pre-trained model. Furthermore, SynQ mitigates misguidance from the pre-trained model's errors by leveraging only soft labels for difficult samples. Extensive experiments show that SynQ achieves state-of-the-art accuracy over existing ZSQ methods.
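The other two signals described in the abstract can be sketched as a combined fine-tuning loss: an L2 alignment term between the quantized and full-precision models' class activation maps, plus a soft-label distillation term restricted to samples flagged as difficult. This is a minimal sketch under assumed names and weighting, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def synq_style_loss(student_logits, teacher_logits,
                    student_cam, teacher_cam, hard_mask, lam=1.0):
    """Illustrative combination of the two fine-tuning signals.

    1) CAM alignment: mean squared error between the quantized (student)
       model's class activation map and the pre-trained (teacher) model's.
    2) Soft-label distillation: cross-entropy against the teacher's soft
       predictions, averaged only over samples where `hard_mask` is 1, so
       potentially erroneous hard labels never enter the loss.
    `lam` balances the two terms (an assumed hyperparameter).
    """
    cam_loss = np.mean((student_cam - teacher_cam) ** 2)
    p_teacher = softmax(teacher_logits)
    log_p_student = np.log(softmax(student_logits) + 1e-12)
    per_sample_kd = -(p_teacher * log_p_student).sum(axis=-1)
    kd_loss = (per_sample_kd * hard_mask).sum() / max(hard_mask.sum(), 1)
    return kd_loss + lam * cam_loss

# Toy usage: 4 samples, 10 classes, 7x7 activation maps.
rng = np.random.default_rng(1)
s_logits = rng.standard_normal((4, 10))
t_logits = rng.standard_normal((4, 10))
s_cam = rng.standard_normal((4, 7, 7))
t_cam = rng.standard_normal((4, 7, 7))
mask = np.array([1.0, 0.0, 1.0, 0.0])  # only samples 0 and 2 count as hard
loss = synq_style_loss(s_logits, t_logits, s_cam, t_cam, mask)
```

Matching the teacher's CAM drives the cam term to zero, so better-aligned activation maps strictly lower the loss while the distillation term remains unaffected.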
Problem

Research questions and friction points this paper is trying to address.

Zero-shot Quantization
Data-free Quantization
Model Compression
Neural Network Quantization
Privacy-preserving Deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-shot Quantization
Synthetic Data
Soft Labels
Class Activation Map Alignment
Low-pass Filtering
πŸ”Ž Similar Papers
No similar papers found.