Semantic Communications with World Models

📅 2025-10-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address severe semantic distortion and reconstruction degradation in semantic video communication under low-bandwidth and dynamic channel conditions, this paper proposes a World Foundation Model (WFM)-based semantic video transmission framework. The framework uses the WFM to predict future frames from the current frame and textual guidance, and adds a lightweight depth-based feedback mechanism so that frames are transmitted only when the prediction is no longer reliable. It further introduces segmentation-assisted partial frame restoration and camera-trajectory-driven proactive scheduling, marking the first integration of prediction reliability with real-time channel state optimization. Experimental results demonstrate that the proposed method reduces transmission overhead by an average of 62% across diverse channel conditions while preserving task-level semantic fidelity and mitigating error accumulation, which improves both the robustness and the efficiency of semantic communication.

📝 Abstract
Semantic communication is a promising technique for emerging wireless applications, which reduces transmission overhead by transmitting only task-relevant features instead of raw data. However, existing methods struggle under extremely low bandwidth and varying channel conditions, where corrupted or missing semantics lead to severe reconstruction errors. To resolve this difficulty, we propose a world foundation model (WFM)-aided semantic video transmission framework that leverages the predictive capability of WFMs to generate future frames based on the current frame and textual guidance. This design allows transmissions to be omitted when predictions remain reliable, thereby saving bandwidth. The WFM's predictions preserve the key semantics, yet minor prediction errors tend to amplify over time. To mitigate this issue, a lightweight depth-based feedback module is introduced to determine whether transmission of the current frame is needed. As an alternative to transmitting the entire frame, a segmentation-assisted partial transmission method is proposed to repair degraded frames, which further balances performance and bandwidth cost. Furthermore, an active transmission strategy is developed for mobile scenarios by exploiting camera trajectory information and proactively scheduling transmissions before channel quality deteriorates. Simulation results show that the proposed framework significantly reduces transmission overhead while maintaining task performance across varying scenarios and channel conditions.
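The feedback idea in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the authors' implementation: depth maps are flattened lists of floats, the WFM's accumulating prediction error is replaced by an assumed constant drift per skipped frame, and the names `mean_abs_depth_error`, `simulate_feedback`, `drift_per_step`, and `threshold` are all hypothetical.

```python
from typing import List


def mean_abs_depth_error(pred: List[float], ref: List[float]) -> float:
    """Mean absolute difference between two flattened depth maps."""
    if len(pred) != len(ref):
        raise ValueError("depth maps must have the same size")
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def simulate_feedback(
    true_depths: List[List[float]],
    drift_per_step: float,
    threshold: float,
) -> List[bool]:
    """Return, for each frame, whether a transmission is triggered.

    Assumption: the prediction is modeled as the last transmitted depth map
    plus a drift that grows with every skipped frame. This stands in for the
    WFM prediction error, which the abstract says amplifies over time.
    """
    decisions: List[bool] = []
    last_sent = true_depths[0]
    steps_since_tx = 0
    for i, depth in enumerate(true_depths):
        if i == 0:
            decisions.append(True)  # the first frame is always transmitted
            continue
        steps_since_tx += 1
        predicted = [d + drift_per_step * steps_since_tx for d in last_sent]
        if mean_abs_depth_error(predicted, depth) > threshold:
            decisions.append(True)  # prediction unreliable: send and reset drift
            last_sent = depth
            steps_since_tx = 0
        else:
            decisions.append(False)  # prediction still reliable: skip
    return decisions


# A static scene with six frames: drift exceeds the threshold every third step.
frames = [[1.0, 2.0] for _ in range(6)]
decisions = simulate_feedback(frames, drift_per_step=0.1, threshold=0.25)
```

In this toy setting only two of six frames are sent. The fraction saved depends entirely on the assumed drift and threshold, so it says nothing about the figures reported in the paper.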
Problem

Research questions and friction points this paper is trying to address.

Semantic communication struggles under low bandwidth and varying channels
Prediction errors from world models amplify over time
Mobile scenarios require proactive transmission scheduling for degraded channels
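As a rough illustration of proactive scheduling, the sketch below assumes that a per-frame channel-quality forecast (here an SNR in dB) has already been derived from the camera trajectory. How the paper obtains that forecast is not described in this summary, so the forecast, the function name `plan_transmissions`, and the parameters `min_snr_db` and `lead` are hypothetical.

```python
from typing import List


def plan_transmissions(
    predicted_snr_db: List[float],
    min_snr_db: float,
    lead: int,
) -> List[int]:
    """Pick frame indices at which to transmit before each predicted outage.

    For every transition from a usable channel into outage (SNR below
    `min_snr_db`), schedule one transmission `lead` frames earlier. If that
    slot is itself in outage, fall back to the nearest earlier usable slot.
    """
    plan = set()
    in_outage = False
    for i, snr in enumerate(predicted_snr_db):
        bad = snr < min_snr_db
        if bad and not in_outage:
            j = max(0, i - lead)
            while j >= 0 and predicted_snr_db[j] < min_snr_db:
                j -= 1  # walk back to a slot where the channel is usable
            if j >= 0:
                plan.add(j)
        in_outage = bad
    return sorted(plan)


# Forecast with two outage periods (frames 3-4 and frame 7).
forecast = [20.0, 18.0, 15.0, 4.0, 3.0, 16.0, 17.0, 2.0]
schedule = plan_transmissions(forecast, min_snr_db=5.0, lead=2)
```

Here a transmission is planned two frames ahead of each outage, while the channel is still good, so the receiver enters the outage with a fresh reference frame.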
Innovation

Methods, ideas, or system contributions that make the work stand out.

World foundation model predicts future frames for bandwidth saving
Lightweight depth feedback determines necessary frame transmissions
Segmentation-assisted partial transmission repairs degraded frames
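Segmentation-assisted partial transmission can be pictured as sending only the regions where the prediction has degraded. The following is a minimal sketch under stated assumptions: frames are flattened lists of pixel values, the segmentation is a per-pixel label list, and the selection rule (mean absolute error per segment above a threshold) is an illustrative choice rather than the criterion used in the paper. All function names are hypothetical.

```python
from typing import Dict, List


def select_segments(
    pred: List[float], ref: List[float], labels: List[int], threshold: float
) -> List[int]:
    """Return ids of segments whose mean absolute error exceeds `threshold`."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for p, r, s in zip(pred, ref, labels):
        sums[s] = sums.get(s, 0.0) + abs(p - r)
        counts[s] = counts.get(s, 0) + 1
    return sorted(s for s in sums if sums[s] / counts[s] > threshold)


def repair(
    pred: List[float], ref: List[float], labels: List[int], segments: List[int]
) -> List[float]:
    """Overwrite the predicted pixels of the chosen segments with true pixels."""
    chosen = set(segments)
    return [r if s in chosen else p for p, r, s in zip(pred, ref, labels)]


def transmitted_fraction(labels: List[int], segments: List[int]) -> float:
    """Fraction of pixels that must be sent for the chosen segments."""
    chosen = set(segments)
    return sum(1 for s in labels if s in chosen) / len(labels)


# Three segments of two pixels each; only segment 1 is badly predicted.
pred = [10.0, 10.0, 50.0, 52.0, 90.0, 90.0]
ref = [10.0, 11.0, 80.0, 82.0, 90.0, 91.0]
labels = [0, 0, 1, 1, 2, 2]

to_send = select_segments(pred, ref, labels, threshold=5.0)
repaired = repair(pred, ref, labels, to_send)
fraction = transmitted_fraction(labels, to_send)
```

In this example one segment out of three is retransmitted, so a third of the pixels are sent while the small residual errors in the other segments are tolerated.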
Peiwen Jiang
Postdoc, The Hong Kong University of Science and Technology; PhD, Southeast University
AI, 6G, semantic communication, LLM
Jiajia Guo
Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong
Chao-Kai Wen
Institute of Communications Engineering, National Sun Yat-sen University, Taiwan
Wireless Communication
Shi Jin
School of Information Science and Engineering, Southeast University, Nanjing 210096, China
Jun Zhang
Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong