Towards Bitrate-Efficient and Noise-Robust Speech Coding with Variable Bitrate RVQ

📅 2025-06-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address rigid bit-rate allocation and redundant bit consumption by noise components in residual vector quantization (RVQ) speech coding under realistic noisy conditions, this paper proposes a variable-bit-rate RVQ (VRVQ) framework. Our method introduces: (1) the first noise-aware, dynamic frame-level bit-rate allocation mechanism, which adaptively adjusts quantization precision based on speech saliency and noise intensity; and (2) the first end-to-end jointly optimized neural feature-domain denoiser integrated into the RVQ quantization loop. Under rate-distortion joint optimization, VRVQ significantly improves coding efficiency across diverse noise scenarios: at equal bit rates, it achieves a mean opinion score (MOS) gain of ≥0.8 over baseline methods, with superior speech intelligibility and subjective quality compared to constant-bit-rate RVQ (CBR-RVQ) and conventional codecs.

Technology Category

Application Category

📝 Abstract
Residual Vector Quantization (RVQ) has become a dominant approach in neural speech and audio coding, providing high-fidelity compression. However, speech coding presents additional challenges due to real-world noise, which degrades compression efficiency. Standard codecs allocate bits uniformly, wasting bitrate on noise components that do not contribute to intelligibility. This paper introduces a Variable Bitrate RVQ (VRVQ) framework for noise-robust speech coding, dynamically adjusting bitrate per frame to optimize rate-distortion trade-offs. Unlike constant bitrate (CBR) RVQ, our method prioritizes critical speech components while suppressing residual noise. Additionally, we integrate a feature denoiser to further improve noise robustness. Experimental results show that VRVQ improves rate-distortion trade-offs over conventional methods, achieving better compression efficiency and perceptual quality in noisy conditions. Samples are available at our project page: https://yoongi43.github.io/noise_robust_vrvq/.
Problem

Research questions and friction points this paper is trying to address.

Optimize bitrate allocation for noisy speech coding
Enhance compression efficiency by suppressing residual noise
Improve noise robustness with dynamic bitrate adjustment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Variable Bitrate RVQ for dynamic bitrate adjustment
Feature denoiser enhances noise robustness
Prioritizes critical speech components over noise
🔎 Similar Papers
2024-10-08IEEE International Conference on Acoustics, Speech, and Signal ProcessingCitations: 0