Error-Resilient Semantic Communication for Speech Transmission over Packet-Loss Networks

📅 2025-12-08
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
Conventional channel protection mechanisms cannot effectively counter packet loss in bandwidth- and latency-constrained wireless networks, which leaves real-time speech transmission insufficiently robust. To address this, we propose Glaris, a semantic communication framework that enhances error resilience while remaining compatible with existing digital communication systems. Its core innovation is a speech encoder that operates in the latent space of a generative model and is guided by generative latent priors, jointly optimizing semantic fidelity and reconstruction quality. Glaris also combines a lightweight forward error correction (FEC) scheme with a latent-prior-driven packet-loss concealment (PLC) mechanism to suppress error propagation and provide semantic-level fault tolerance. Experiments on LibriSpeech show that Glaris achieves joint source-channel coding (JSCC)-level robustness with significantly lower redundancy overhead, giving a favorable trade-off between transmission efficiency and speech quality.

πŸ“ Abstract
Real-time speech communication over wireless networks remains challenging, as conventional channel protection mechanisms cannot effectively counter packet loss under stringent bandwidth and latency constraints. Semantic communication has emerged as a promising paradigm for enhancing the robustness of speech transmission by means of joint source-channel coding (JSCC). However, its cross-layer design is incompatible with existing digital communication systems, which hinders practical deployment. Within existing digital systems, the robustness of speech communication is therefore evaluated primarily by its error resilience to packet loss over wireless networks. To address these challenges, we propose Glaris, a generative latent-prior-based resilient speech semantic communication framework that performs resilient speech coding in the generative latent space. Generative latent priors enable high-quality packet loss concealment (PLC) at the receiver side, balancing semantic consistency and reconstruction fidelity. In addition, an integrated error resilience mechanism is designed to mitigate error propagation and improve the effectiveness of PLC. Compared with traditional packet-level forward error correction (FEC) strategies, the proposed method achieves enhanced robustness over dynamic wireless networks while significantly reducing redundancy overhead. Experimental results on the LibriSpeech dataset demonstrate that Glaris consistently outperforms existing error-resilient codecs, achieving JSCC-level robustness while maintaining seamless compatibility with existing systems, and that it strikes a favorable balance between transmission efficiency and speech reconstruction quality.
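
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of conventional packet-level FEC using a single XOR parity packet per group. It is not the Glaris mechanism, which the abstract does not specify in detail. The function names, the four-packet group size, and the frame contents are all hypothetical. The sketch shows where redundancy overhead comes from (one extra packet per group) and why such a scheme fails on burst loss, which is the gap a receiver-side PLC would need to cover.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet for a group of equal-length packets."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> Optional[List[bytes]]:
    """Restore at most one missing packet (marked None) in the group.

    Returns None when more than one packet is lost, because a single
    parity packet cannot repair a burst.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        return None
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with all surviving packets yields the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


# Hypothetical group of four 4-byte speech frames (illustrative data only).
group = [b"fr00", b"fr01", b"fr02", b"fr03"]
parity = make_parity(group)
overhead = 1 / len(group)  # one parity packet per four data packets

one_lost = recover([group[0], None, group[2], group[3]], parity)
two_lost = recover([None, None, group[2], group[3]], parity)
```

In this sketch a single loss is repaired exactly, while a two-packet burst is unrecoverable. Shrinking the group protects against more loss patterns but raises the overhead, which is the efficiency trade-off the abstract refers to.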
Problem

Research questions and friction points this paper is trying to address.

Enhances speech transmission robustness over packet-loss wireless networks.
Balances semantic consistency and reconstruction fidelity using generative latent priors.
Reduces redundancy overhead while maintaining compatibility with existing systems.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative latent-prior framework for resilient speech coding
Integrated error resilience mechanism to mitigate propagation
Achieves JSCC robustness with reduced redundancy overhead