🤖 AI Summary
To address the limited generalization of synthetic images and the semantic ambiguity (e.g., polysemy) induced by single-class text prompts, both of which degrade data-free quantization (DFQ) performance, this work first uncovers the intrinsic correlation between synthetic image quality and the generalizability of the quantized model. It proposes a novel *mixup-class prompt* strategy: multi-class semantic information is fused at the text-prompt level of a text-conditioned generative model to produce diverse, robust synthetic calibration data, with gradient norm and generalization error analyses providing quantitative support. This approach substantially mitigates optimization instability in post-training quantization (PTQ). Under extremely low-bit settings (e.g., W2A4), it surpasses state-of-the-art methods including GenQ, achieving superior accuracy on both CNN and ViT architectures.
📝 Abstract
Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images with generative models such as generative adversarial networks (GANs) and text-conditioned latent diffusion models (LDMs), while applying existing PTQ algorithms. However, the relationship between the generated synthetic images and the generalizability of the quantized model during PTQ remains underexplored. Without investigating this relationship, synthetic images generated by previous prompt engineering methods based on single-class prompts suffer from issues such as polysemy, leading to performance degradation. We propose **mixup-class prompt**, a mixup-based text prompting strategy that fuses multiple class labels at the text-prompt level to generate diverse, robust synthetic data. This approach enhances generalization and improves optimization stability in PTQ. We provide quantitative insights through gradient norm and generalization error analysis. Experiments on convolutional neural networks (CNNs) and vision transformers (ViTs) show that our method consistently outperforms state-of-the-art DFQ methods such as GenQ. Furthermore, it pushes the performance boundary in extremely low-bit scenarios, achieving new state-of-the-art accuracy in challenging 2-bit weight, 4-bit activation (W2A4) quantization.
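To make the prompt-level fusion concrete, here is a minimal sketch of how a mixup-class prompt might be constructed before being passed to a text-conditioned diffusion model. The template string, the number of fused classes `k`, and the function name are illustrative assumptions, not the paper's exact implementation:

```python
import random

def mixup_class_prompt(class_names, k=2, template="a photo of a {}"):
    """Fuse k sampled class labels into a single text prompt.

    Illustrative sketch: a single-class prompt like "a photo of a crane"
    is ambiguous (bird vs. machine), whereas fusing several class names
    into one prompt injects multi-class semantics, as the mixup-class
    strategy does at the text-prompt level. The template and joining
    phrase here are assumptions for demonstration only.
    """
    chosen = random.sample(class_names, k)
    return template.format(" and a ".join(chosen))

# Example: sample two ImageNet-style class names into one prompt,
# which would then condition the image generator.
classes = ["goldfish", "tabby cat", "pickup truck"]
prompt = mixup_class_prompt(classes, k=2)
```

Each synthetic calibration image is then generated from such a fused prompt, so the resulting batch covers mixed multi-class semantics rather than one possibly ambiguous class per image.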