Batch Normalization-Free Fully Integer Quantized Neural Networks via Progressive Tandem Learning

📅 2025-12-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Quantized neural networks (QNNs) face challenges in achieving fully integer-only deployment because they rely on batch normalization (BN) layers, which hinders efficient execution on resource-constrained edge devices. This paper proposes the first BN-free, end-to-end integer-only quantization framework. To replace BN's functionality, we introduce inter-layer progressive knowledge distillation coupled with a dynamic compensation mechanism; we further integrate parameter folding and custom integer-only operators to enable pure integer inference. Our method imposes no architectural or initialization constraints and significantly improves stability and accuracy under low-bit quantization. On ImageNet, our 4-bit quantized AlexNet achieves Top-1 accuracy within 0.3% of the BN-based floating-point baseline, while eliminating all floating-point operations and any runtime dependency on batch statistics. This represents the first practical solution for truly integer-only deployment of QNNs without BN.
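The summary does not specify the paper's custom integer-only operators, but a standard way to remove the last floating-point step between quantized layers is fixed-point requantization: the float rescale factor is approximated by an integer multiplier plus a right shift. A minimal sketch (function names and the assumption 0 < scale < 1 are mine, not the paper's):

```python
def quantize_multiplier(real_scale, bits=32):
    """Approximate a float rescale factor (assumed in (0, 1)) as a pair
    (int_mult, shift) such that x * real_scale ~= (x * int_mult) >> shift."""
    shift = 0
    while real_scale < 0.5:          # normalise mantissa into [0.5, 1)
        real_scale *= 2.0
        shift += 1
    int_mult = int(round(real_scale * (1 << (bits - 1))))
    return int_mult, shift + (bits - 1)

def requantize(acc, int_mult, shift):
    """Integer-only rescale of an integer accumulator to the next layer's
    integer domain: multiply, add rounding term, shift -- no floats at all."""
    rounding = 1 << (shift - 1)
    return (acc * int_mult + rounding) >> shift
```

For example, rescaling an accumulator of 1000 by a factor of 0.0123 yields 12 using only integer arithmetic; on hardware the multiply is a single widening integer multiplication.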

📝 Abstract
Quantised neural networks (QNNs) shrink models and reduce inference energy through low-bit arithmetic, yet most still depend on a running statistics batch normalisation (BN) layer, preventing true integer-only deployment. Prior attempts remove BN by parameter folding or tailored initialisation; while helpful, they rarely recover BN's stability and accuracy and often impose bespoke constraints. We present a BN-free, fully integer QNN trained via a progressive, layer-wise distillation scheme that slots into existing low-bit pipelines. Starting from a pretrained BN-enabled teacher, we use layer-wise targets and progressive compensation to train a student that performs inference exclusively with integer arithmetic and contains no BN operations. On ImageNet with AlexNet, the BN-free model attains competitive Top-1 accuracy under aggressive quantisation. The procedure integrates directly with standard quantisation workflows, enabling end-to-end integer-only inference for resource-constrained settings such as edge and embedded devices.
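The parameter folding mentioned in the abstract merges BN's per-channel affine transform into the preceding layer's weights and bias, so no separate BN op survives at inference. A minimal NumPy sketch for a fully connected layer (my own naming; for convolutions the same per-channel folding is applied along the output-channel axis):

```python
import numpy as np

def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters (gamma, beta, running mean/var) into the
    preceding linear layer's weight and bias, one output channel at a time."""
    scale = gamma / np.sqrt(var + eps)       # per-channel BN scale
    w_folded = weight * scale[:, None]       # scale each output row
    b_folded = (bias - mean) * scale + beta  # absorb BN shift into the bias
    return w_folded, b_folded
```

The folded layer computes `scale * (W @ x + b - mean) + beta`, exactly the linear-then-BN composition, which is why folding alone preserves accuracy but, as the abstract notes, does not by itself recover BN's training stability.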
Problem

Research questions and friction points this paper is trying to address.

Eliminates batch normalization for integer-only neural networks
Enables fully integer inference via progressive layer-wise distillation
Maintains accuracy in aggressive quantization for edge devices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive layer-wise distillation for BN-free training
Integer-only inference via tandem learning compensation
Seamless integration into standard quantization workflows
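The page does not spell out the training objective, but a common reading of "progressive layer-wise distillation" is to regress each BN-free student layer's activations onto the BN-enabled teacher's, unlocking one layer at a time. A hypothetical sketch of such a loss (the function and its `active_layer` schedule are illustrative, not the paper's exact formulation):

```python
import numpy as np

def progressive_distill_loss(student_acts, teacher_acts, active_layer):
    """Sum of per-layer MSE targets between the BN-free student's activations
    and the BN-enabled teacher's; layers beyond `active_layer` are not yet
    being trained, so they contribute nothing to the loss."""
    loss = 0.0
    for k, (s, t) in enumerate(zip(student_acts, teacher_acts)):
        if k > active_layer:
            break
        loss += float(np.mean((s - t) ** 2))
    return loss
```

In a progressive schedule, `active_layer` is advanced as earlier layers converge, so each new layer inherits stabilised inputs, which is one plausible way a distillation signal can stand in for BN's normalising effect.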
👥 Authors

Pengfei Sun, Department of Information Technology, WAVES Research Group, Ghent University, Ghent, Belgium
Wenyu Jiang, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore
Piew Yoong Chee, Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore
Paul Devos, Ghent University (bioacoustics, acoustics, soundscapes, machine learning, instrumentation)
Dick Botteldooren, Ghent University (environmental sound, outdoor sound propagation, auditory perception, machine listening)