🤖 AI Summary
This work addresses the issue of error accumulation in layer-wise quantization of encoder-decoder automatic speech recognition (ASR) models, which often leads to performance instability and degraded accuracy. To mitigate this, the authors propose Fine-grained Alpha for Dynamic Quantization Error Propagation (FADE), a post-training quantization method that introduces an adaptive, cross-layer error-correction mechanism tailored to the heterogeneous characteristics of ASR encoders and decoders. FADE dynamically balances local quantization against cross-layer error compensation to improve quantization fidelity. Experimental results demonstrate that FADE significantly reduces the variance of word error rate (WER) across multiple runs and achieves consistently lower average WER than existing baseline methods.
📝 Abstract
Running Automatic Speech Recognition (ASR) models on memory-constrained edge devices requires efficient compression. While layer-wise post-training quantization is effective, it suffers from error accumulation, especially in encoder-decoder architectures. Existing solutions such as Quantization Error Propagation (QEP) are suboptimal for ASR because the model is heterogeneous: the encoder processes acoustic features while the decoder generates text. To address this, we propose Fine-grained Alpha for Dynamic Quantization Error Propagation (FADE), which adaptively controls the trade-off between cross-layer error correction and local quantization. Experiments show that FADE significantly improves stability by reducing performance variance across runs, while simultaneously surpassing baselines in mean WER.
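The core idea described above, interpolating between full cross-layer error propagation and purely local quantization with a per-layer coefficient, can be sketched in a few lines. The helper names, the uniform symmetric quantizer, and the blending formula below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def quantize(w, num_bits=8):
    """Uniform symmetric quantization of a weight matrix (illustrative)."""
    scale = np.abs(w).max() / (2 ** (num_bits - 1) - 1)
    return np.round(w / scale) * scale

def layerwise_ptq_with_alpha(weights, x, alphas):
    """Hypothetical sketch of layer-wise PTQ with per-layer alpha.

    Each layer's quantized-path input blends the propagated quantized
    activations (carrying upstream quantization error) with the clean
    full-precision activations. alpha near 1 -> full cross-layer error
    propagation (QEP-style); alpha near 0 -> purely local quantization.
    """
    x_fp, x_q = x, x
    q_weights = []
    for w, alpha in zip(weights, alphas):
        w_q = quantize(w)
        q_weights.append(w_q)
        # Blend propagated and clean inputs before applying the layer.
        x_in = alpha * x_q + (1 - alpha) * x_fp
        x_q = x_in @ w_q.T
        x_fp = x_fp @ w.T
    return q_weights, x_q, x_fp
```

In a real method the alphas would be chosen adaptively (e.g. per layer, from calibration data) rather than fixed; here they are simply passed in to expose the trade-off the abstract describes.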