AI Summary
Neural network quantization at ultra-low bit-widths suffers from significant accuracy degradation due to the non-differentiability of discrete quantization operations, causing standard quantization-aware training (QAT), which relies on the straight-through estimator (STE), to ignore quantization-induced discretization errors. To address this, we propose a progressive element-wise gradient estimation framework. Our method introduces a logarithmic curriculum-driven mixed-precision replacement mechanism that jointly optimizes task loss and quantization discretization error. By integrating progressive variable substitution, element-level gradient calibration, and explicit discretization error modeling, we establish an end-to-end co-optimization framework. The approach is plug-and-play and fully compatible with diverse forward quantization strategies. Evaluated on CIFAR-10 and ImageNet, our method matches or even surpasses full-precision accuracy on ResNet and VGG models quantized to 2–4 bits, consistently outperforming state-of-the-art QAT approaches.
Abstract
Neural network quantization aims to reduce the bit-widths of weights and activations, making it a critical technique for deploying deep neural networks on resource-constrained hardware. Most Quantization-Aware Training (QAT) methods rely on the Straight-Through Estimator (STE) to address the non-differentiability of discretization functions, replacing their derivatives with that of the identity function. While effective, STE overlooks discretization errors between continuous and quantized values, which can degrade accuracy, especially at extremely low bit-widths. In this paper, we propose Progressive Element-wise Gradient Estimation (PEGE), a simple yet effective alternative to STE that can be seamlessly integrated with any forward-propagation method and improves quantized model accuracy. PEGE progressively replaces full-precision weights and activations with their quantized counterparts via a novel logarithmic curriculum-driven mixed-precision replacement strategy. It then formulates QAT as a co-optimization problem that simultaneously minimizes the task loss for prediction and the discretization error for quantization, providing a unified and generalizable framework. Extensive experiments on CIFAR-10 and ImageNet across various architectures (e.g., ResNet, VGG) demonstrate that PEGE consistently outperforms existing backpropagation methods and enables low-precision models to match or even exceed the accuracy of their full-precision counterparts.
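The abstract describes two core ingredients: a logarithmic curriculum that progressively swaps full-precision elements for quantized ones, and a loss that jointly penalizes task error and discretization error. The sketch below illustrates one plausible reading of these ideas in PyTorch. The schedule form, the uniform quantizer, the element-wise masking, and the weighting `lam` are all illustrative assumptions, not the paper's actual implementation.

```python
import math
import torch

def replacement_fraction(step, total_steps):
    # Hypothetical logarithmic curriculum: the fraction of elements
    # replaced by quantized values grows quickly early in training
    # and saturates at 1.0 (assumed form, not the paper's schedule).
    return min(1.0, math.log(1 + step) / math.log(1 + total_steps))

def quantize(x, bits=2):
    # Simple symmetric uniform quantizer as a stand-in for whatever
    # forward quantization scheme is plugged in.
    scale = x.abs().max() / (2 ** (bits - 1) - 1) + 1e-8
    return torch.round(x / scale) * scale

def mixed_precision(w, frac, bits=2):
    # Element-wise replacement: a random subset of entries uses
    # quantized values while the rest stay full precision, so
    # gradients flow through the continuous entries without STE.
    mask = (torch.rand_like(w) < frac).float()
    return mask * quantize(w, bits) + (1.0 - mask) * w

def total_loss(task_loss, w, bits=2, lam=0.1):
    # Co-optimization: task loss plus an explicit discretization
    # error term (lam is an assumed trade-off weight).
    disc_err = (w - quantize(w, bits)).pow(2).mean()
    return task_loss + lam * disc_err
```

Under this reading, early in training most elements remain continuous and trainable, while late in training nearly all elements are quantized, so the network converges toward a model whose forward pass matches the deployed low-bit model.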