AI Summary
To address the severe performance degradation in post-training quantization (PTQ) of Vision Transformers (ViTs) caused by coupled activation and weight quantization errors, this paper proposes ERQ, a two-stage error suppression framework. In the first stage, Aqer mitigates activation quantization error via Reparameterization Initialization followed by a closed-form Ridge Regression calibration of the still-full-precision weights. In the second stage, Wqer suppresses weight quantization error: Dual Uniform Quantization handles the outlier-heavy weights, and Rounding Refinement iteratively adjusts rounding directions using an efficient, empirically derived proxy. ERQ requires no fine-tuning, striking a strong balance between accuracy and efficiency: under the W3A4 setting, ViT-S attains 36.81% higher top-1 accuracy on ImageNet than GPTQ, substantially outperforming existing PTQ methods. Moreover, ERQ generalizes well across diverse ViT architectures and downstream tasks.
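The closed-form calibration in Aqer can be pictured as an ordinary ridge regression: keep the layer's target output (full-precision activations times full-precision weights) fixed, and solve for adjusted weights that reproduce it from the quantized activations. The sketch below is illustrative only; the function name, the exact objective, and the regularization strength are assumptions, not the paper's implementation.

```python
import numpy as np

def aqer_ridge_update(X_fp, X_q, W_fp, lam=1e-2):
    """Sketch of a closed-form ridge-regression weight update.

    Assumed objective:
        min_W ||X_q @ W - X_fp @ W_fp||^2 + lam * ||W - W_fp||^2
    i.e., adjust the (still full-precision) weights W so the layer output
    computed from quantized activations X_q matches the original output,
    while staying close to the original weights W_fp.
    """
    d = X_q.shape[1]
    target = X_fp @ W_fp                      # full-precision layer output
    A = X_q.T @ X_q + lam * np.eye(d)         # regularized normal matrix
    b = X_q.T @ target + lam * W_fp           # pulls solution toward W_fp
    return np.linalg.solve(A, b)              # closed-form ridge solution

# Toy usage: simulate activation quantization as additive noise.
rng = np.random.default_rng(0)
X_fp = rng.normal(size=(256, 16))
X_q = X_fp + 0.1 * rng.normal(size=(256, 16))  # "quantized" activations
W_fp = rng.normal(size=(16, 8))
W = aqer_ridge_update(X_fp, X_q, W_fp)
```

Because the update is closed-form, it needs only a small calibration batch and no gradient-based fine-tuning, which is what gives this stage its efficiency.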
Abstract
Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities due to its minimal data needs and high time efficiency. However, many current methods fail to account for the complex interactions between quantized weights and activations, resulting in significant quantization errors and suboptimal performance. This paper presents ERQ, an innovative two-step PTQ method specifically crafted to sequentially reduce the quantization errors arising from activation and weight quantization. The first step, Activation quantization error reduction (Aqer), applies Reparameterization Initialization to mitigate the initial quantization errors in high-variance activations. It then further reduces the errors by formulating a Ridge Regression problem, which updates the weights maintained at full precision using a closed-form solution. The second step, Weight quantization error reduction (Wqer), applies Dual Uniform Quantization to handle weights with numerous outliers, which arise from the adjustments made during Reparameterization Initialization, thereby reducing initial weight quantization errors. It then tackles the remaining errors iteratively: each iteration adopts Rounding Refinement, which uses an empirically derived, efficient proxy to refine the rounding directions of quantized weights, complemented by a Ridge Regression solver to reduce the errors. Comprehensive experimental results demonstrate ERQ's superior performance across various ViT variants and tasks. For example, ERQ surpasses the state-of-the-art GPTQ by a notable 36.81% in accuracy for W3A4 ViT-S. Our codes are available at https://github.com/zysxmu/ERQ.
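The intuition behind Dual Uniform Quantization is that a handful of outliers inflate the step size of a single uniform quantizer and wash out all the small weights. Splitting the weights into an inlier group and an outlier group, each with its own uniform quantizer, avoids this. The sketch below is a minimal illustration under assumed choices (a magnitude-percentile split, symmetric per-tensor scales); the function names and the outlier fraction are hypothetical, not the paper's exact scheme.

```python
import numpy as np

def uniform_quant(w, n_bits):
    """Symmetric uniform quantizer with a per-tensor scale."""
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = np.abs(w).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def dual_uniform_quant(w, n_bits=3, outlier_pct=1.0):
    """Sketch of dual uniform quantization (assumed form): split weights
    by magnitude into inliers and outliers, then quantize each group with
    its own uniform quantizer so outliers don't inflate the inlier scale."""
    thresh = np.percentile(np.abs(w), 100 - outlier_pct)
    outliers = np.abs(w) > thresh
    w_q = np.empty_like(w)
    w_q[~outliers] = uniform_quant(w[~outliers], n_bits)  # fine inlier scale
    w_q[outliers] = uniform_quant(w[outliers], n_bits)    # coarse outlier scale
    return w_q

# Toy usage: mostly small weights plus a few large outliers,
# mimicking the outlier-heavy weights described in the abstract.
rng = np.random.default_rng(0)
w = np.concatenate([rng.normal(0.0, 0.02, size=1000),
                    np.array([1.0, -1.2, 0.9, -1.1])])
w_dual = dual_uniform_quant(w, n_bits=3)
w_single = uniform_quant(w, n_bits=3)
```

On such a heavy-tailed weight vector, the single 3-bit quantizer rounds nearly every inlier to zero, while the dual scheme preserves them, so its mean squared error is markedly lower.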