Enhancing Quantization-Aware Training on Edge Devices via Relative Entropy Coreset Selection and Cascaded Layer Correction

📅 2025-07-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address data scarcity and accumulated quantization errors in low-bit quantization for edge devices, this paper proposes QuaRC, an efficient quantization-aware training (QAT) framework. Methodologically, QuaRC introduces two key innovations: (1) a high-representativeness coreset constructed via relative entropy scoring, drastically reducing training data requirements; and (2) a cascaded inter-layer error correction mechanism that aligns intermediate feature outputs between full-precision and quantized models, mitigating accuracy degradation under extreme data scarcity. Under the stringent constraint of using only 1% of ImageNet-1K (i.e., ~13k samples), QuaRC achieves a 5.72% absolute Top-1 accuracy improvement over the prior state of the art for 2-bit ResNet-18, substantially outperforming conventional QAT methods. By jointly optimizing data efficiency and quantization robustness, QuaRC establishes a new paradigm for high-accuracy, ultra-low-bit model training in resource-constrained edge environments.

๐Ÿ“ Abstract
With the development of mobile and edge computing, the demand for low-bit quantized models on edge devices is increasing to achieve efficient deployment. To enhance performance, it is often necessary to retrain the quantized models using edge data. However, due to privacy concerns, certain sensitive data can only be processed on edge devices. Therefore, employing Quantization-Aware Training (QAT) on edge devices has become an effective solution. Nevertheless, traditional QAT relies on the complete dataset for training, which incurs a huge computational cost. Coreset selection techniques can mitigate this issue by training on the most representative subsets. However, existing methods struggle to eliminate quantization errors in the model when using small-scale datasets (e.g., only 10% of the data), leading to significant performance degradation. To address these issues, we propose QuaRC, a QAT framework with coresets on edge devices, which consists of two main phases: In the coreset selection phase, QuaRC introduces the "Relative Entropy Score" to identify the subsets that most effectively capture the model's quantization errors. During the training phase, QuaRC employs the Cascaded Layer Correction strategy to align the intermediate layer outputs of the quantized model with those of the full-precision model, thereby effectively reducing the quantization errors in the intermediate layers. Experimental results demonstrate the effectiveness of our approach. For instance, when quantizing ResNet-18 to 2-bit using a 1% data subset, QuaRC achieves a 5.72% improvement in Top-1 accuracy on the ImageNet-1K dataset compared to state-of-the-art techniques.
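The coreset selection phase above scores samples by how strongly they expose the model's quantization error. The paper's exact "Relative Entropy Score" formula is not given here, but the idea can be sketched as a per-sample KL divergence between the full-precision and quantized models' output distributions, keeping the highest-scoring fraction (e.g. 1%). Function names and the scoring details below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def relative_entropy_scores(fp_model, q_model, loader, device="cpu"):
    """Score each sample by KL(FP || quantized) over class distributions.

    Hypothetical sketch: higher scores mark samples on which the
    quantized model's predictions diverge most from the FP reference.
    """
    fp_model.eval()
    q_model.eval()
    scores = []
    with torch.no_grad():
        for x, _ in loader:
            x = x.to(device)
            p = F.log_softmax(fp_model(x), dim=1)  # FP log-probabilities
            q = F.log_softmax(q_model(x), dim=1)   # quantized log-probabilities
            # Per-sample KL divergence, summed over classes
            kl = (p.exp() * (p - q)).sum(dim=1)
            scores.append(kl.cpu())
    return torch.cat(scores)

def select_coreset(scores, fraction=0.01):
    """Keep the top-`fraction` of samples by score (e.g. 1% of ImageNet-1K)."""
    k = max(1, int(len(scores) * fraction))
    return torch.topk(scores, k).indices
```

The selected indices would then define the small training subset used during QAT, replacing the full dataset.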
Problem

Research questions and friction points this paper is trying to address.

Enhancing quantization-aware training on edge devices
Reducing computational cost with coreset selection
Minimizing quantization errors in small-scale datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative Entropy Score for coreset selection
Cascaded Layer Correction for error reduction
Quantization-Aware Training on edge devices
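The Cascaded Layer Correction idea, aligning the quantized model's intermediate outputs with the full-precision model's, can be sketched as an auxiliary feature-matching loss collected via forward hooks. The helper name, the use of plain MSE, and the assumption that both models share module names are illustrative choices; the paper's cascaded scheme corrects layers in order so that earlier corrections benefit later layers.

```python
import torch
import torch.nn as nn

def layer_correction_loss(fp_model, q_model, x, layer_names):
    """Sum of MSE losses between matching intermediate features of the
    full-precision and quantized models (a simplified, non-cascaded sketch).
    """
    feats = {"fp": {}, "q": {}}

    def make_hook(store, name):
        def hook(module, inputs, output):
            store[name] = output  # capture the layer's output tensor
        return hook

    handles = []
    for name, m in fp_model.named_modules():
        if name in layer_names:
            handles.append(m.register_forward_hook(make_hook(feats["fp"], name)))
    for name, m in q_model.named_modules():
        if name in layer_names:
            handles.append(m.register_forward_hook(make_hook(feats["q"], name)))

    with torch.no_grad():
        fp_model(x)  # full-precision features serve as fixed targets
    q_model(x)       # quantized features keep gradients for training

    loss = sum(nn.functional.mse_loss(feats["q"][n], feats["fp"][n].detach())
               for n in layer_names)
    for h in handles:
        h.remove()
    return loss
```

In training, this loss would be added to the usual task loss so that gradient updates pull the quantized model's intermediate representations back toward the full-precision ones.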
Yujia Tong
Wuhan University of Technology
Machine Learning · Efficient Computing
Jingling Yuan
Hubei Key Laboratory of Transportation Internet of Things, School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Hubei 430072, China
Chuang Hu
School of Computer Science, Wuhan University, Hubei 430072, China