Error-Resilient Semantic Communication for Speech Transmission over Packet-Loss Networks

📅 2025-12-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Conventional channel protection mechanisms struggle with packet loss in bandwidth- and latency-constrained wireless networks, leaving real-time speech insufficiently robust. To address this, the paper proposes Glaris, a semantic communication framework that improves error resilience while remaining compatible with existing digital communication systems. Its core component is a speech encoder that operates in the latent space of a generative model and is guided by generative latent priors, so that semantic fidelity and reconstruction quality are optimized jointly. Glaris also combines a lightweight forward error correction (FEC) scheme with a latent-prior-driven packet loss concealment (PLC) mechanism to suppress error propagation and provide semantic-level fault tolerance. Experiments on LibriSpeech show that Glaris reaches joint source-channel coding (JSCC)-level robustness with significantly lower redundancy overhead than traditional packet-level FEC, giving a favorable trade-off between transmission efficiency and speech quality.

📝 Abstract

Real-time speech communication over wireless networks remains challenging, as conventional channel protection mechanisms cannot effectively counter packet loss under stringent bandwidth and latency constraints. Semantic communication has emerged as a promising paradigm for enhancing the robustness of speech transmission by means of joint source-channel coding (JSCC). However, its cross-layer design is incompatible with existing digital communication systems, which hinders practical deployment. Within such systems, the robustness of speech communication is therefore evaluated primarily by error resilience to packet loss over wireless networks. To address these challenges, we propose Glaris, a generative latent-prior-based resilient speech semantic communication framework that performs resilient speech coding in the generative latent space. Generative latent priors enable high-quality packet loss concealment (PLC) at the receiver side, balancing semantic consistency and reconstruction fidelity. Additionally, an integrated error resilience mechanism is designed to mitigate error propagation and improve the effectiveness of PLC. Compared with traditional packet-level forward error correction (FEC) strategies, our method achieves enhanced robustness over dynamic wireless networks while significantly reducing redundancy overhead. Experimental results on the LibriSpeech dataset demonstrate that Glaris consistently outperforms existing error-resilient codecs, achieving JSCC-level robustness while maintaining seamless compatibility with existing systems, and that it strikes a favorable balance between transmission efficiency and speech reconstruction quality.
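
For context on the baseline the abstract compares against, the following is a minimal sketch of generic packet-level FEC using a single XOR parity packet per group. It is not the Glaris mechanism, whose details are not given on this page; the function names (`add_parity`, `recover`, `redundancy_overhead`) and the group size are assumptions chosen for illustration. It shows where the redundancy overhead comes from (one extra packet per `k` data packets) and why a second loss in the same group cannot be repaired and must be left to concealment.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(packets):
    """Return the data packets followed by one XOR parity packet.

    Assumes all packets in the group have the same length.
    """
    parity = reduce(xor_bytes, packets)
    return list(packets) + [parity]


def recover(received):
    """Rebuild the data packets of one protected group.

    `received` is the group as sent by add_parity, with lost packets
    replaced by None. One loss is repairable; with two or more losses
    the function returns None, and a real receiver would fall back to
    packet loss concealment.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        return None
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of all surviving packets equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired[:-1]  # drop the parity packet


def redundancy_overhead(k: int) -> float:
    """Extra traffic relative to the payload: one parity per k packets."""
    return 1 / k


group = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
sent = add_parity(group)

one_loss = list(sent)
one_loss[2] = None            # a single lost packet is repaired
restored = recover(one_loss)

two_losses = list(one_loss)
two_losses[0] = None          # a second loss defeats the parity
unrecoverable = recover(two_losses)
```

In this sketch, protecting every 4 packets costs 25% extra traffic and still fails on bursty losses, which is the kind of trade-off between redundancy and robustness that the abstract refers to.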

Problem

Research questions and friction points this paper is trying to address.

Enhances speech transmission robustness over packet-loss wireless networks.

Balances semantic consistency and reconstruction fidelity using generative latent priors.

Reduces redundancy overhead while maintaining compatibility with existing systems.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative latent-prior framework for resilient speech coding

Integrated error resilience mechanism to mitigate propagation

Achieves JSCC robustness with reduced redundancy overhead
