SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling

📅 2025-01-22
🏛️ IEEE Journal on Selected Areas in Communications
🤖 AI Summary
To address severe audio quality degradation under high packet-loss rates (up to 40%), this paper proposes SoundSpring, the first semantic audio transceiver system that unifies causal masked language modeling (Causal MLM) for both audio compression and real-time packet-loss concealment. SoundSpring performs joint source-channel optimization in a neural acoustic feature latent space, using causal sequential masking, neural quantization, and digital packetized transmission, in contrast to conventional end-to-end analog mapping paradigms. Experimental results show that SoundSpring consistently outperforms state-of-the-art methods across major objective metrics, including PESQ, STOI, and ViSQOL, with substantial improvements in perceptual speech quality and robustness. By integrating semantic representation learning with channel-aware transmission design, SoundSpring establishes a new paradigm for semantic communication over unreliable channels.

📝 Abstract
In this paper, we propose "SoundSpring", an error-resilient audio transceiver that combines the robustness benefits of joint source-channel coding (JSCC) with compatibility with current digital communication systems. Unlike recent deep JSCC transceivers, which learn to map audio signals directly to analog channel-input symbols via neural networks, SoundSpring adopts a layered architecture that separates audio compression from digital coded transmission while fully exploiting the in-context predictive capabilities of large language (foundation) models. Integrated with a causal-order mask learning strategy, our single model operates in the latent feature domain and serves two functions: as an efficient audio compressor at the transmitter and as an effective packet loss concealment mechanism at the receiver. By jointly optimizing for both audio compression efficiency and transmission error resilience, we show that mask-learned language models are powerful contextual predictors, and that our dual-functional compression and concealment framework offers a fresh perspective on applying foundation language models to audio communication. Extensive experimental evaluations show that SoundSpring clearly outperforms contemporary audio transmission systems in terms of signal fidelity metrics and perceptual quality scores. These findings not only support the practical deployment of SoundSpring in learning-based audio communication systems but also inspire the development of future audio semantic transceivers.
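The dual role described in the abstract, one context predictor used for compression at the transmitter and for concealment at the receiver, can be illustrated with a toy sketch. This is not the authors' implementation: a smoothed bigram counter stands in for the masked language model, the tokens are made-up quantizer indices, and the names `ToyContextModel`, `ideal_code_length_bits`, and `conceal` are hypothetical. The sketch only shows that the same predicted distribution can give an entropy-coding cost and fill in lost tokens.

```python
import math
from collections import Counter, defaultdict


class ToyContextModel:
    """Hypothetical stand-in for the language model: an add-one smoothed
    bigram predictor over quantized token indices (an assumption, not the
    paper's architecture)."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        # Count how often each token follows each previous token.
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.counts[prev][cur] += 1

    def probs(self, prev):
        # p(next token | previous token) with add-one smoothing.
        c = self.counts[prev]
        total = sum(c.values()) + self.vocab_size
        return [(c[t] + 1) / total for t in range(self.vocab_size)]


def ideal_code_length_bits(model, tokens):
    """Transmitter role: ideal entropy-coded length, the sum of
    -log2 p(token | context). The first token is charged a uniform cost."""
    bits = math.log2(model.vocab_size)
    for prev, cur in zip(tokens, tokens[1:]):
        bits += -math.log2(model.probs(prev)[cur])
    return bits


def conceal(model, received):
    """Receiver role: replace lost tokens (None) with the most likely token
    given the previously recovered token (causal order)."""
    out = []
    for tok in received:
        if tok is None:
            if out:
                p = model.probs(out[-1])
                tok = max(range(model.vocab_size), key=p.__getitem__)
            else:
                tok = 0  # no context yet: arbitrary fallback for this sketch
        out.append(tok)
    return out


model = ToyContextModel(vocab_size=4)
model.fit([[0, 1, 2, 3] * 8])

tokens = [0, 1, 2, 3, 0, 1, 2, 3]
bits = ideal_code_length_bits(model, tokens)
uniform_bits = len(tokens) * math.log2(model.vocab_size)

received = [0, 1, None, 3, None, 1, 2, 3]  # None marks a lost packet
recovered = conceal(model, received)

print(f"model-coded: {bits:.2f} bits, uniform: {uniform_bits:.0f} bits")
print("recovered:", recovered)
```

On this highly predictable toy sequence the model-based code length falls well below the uniform cost, and the two lost tokens are restored from context. Real audio tokens are far less predictable, so the gap and the concealment accuracy here say nothing about the paper's reported results.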
Problem

Research questions and friction points this paper is trying to address.

Audio Transmission System
Signal Degradation
Packet Loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint Source-Channel Coding
Large Language Models
Audio Semantic Transceivers
Shengshi Yao
Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China
Jincheng Dai
Beijing University of Posts and Telecommunications
Real-Time Communications, Wireless Multimedia, AI-RAN, Source and Channel Coding
Xiaoqi Qin
Beijing University of Posts and Telecommunications
Sixian Wang
Beijing University of Posts and Telecommunications
Siye Wang
Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China
Kai Niu
Key Laboratory of Universal Wireless Communications, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China
Ping Zhang
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China