Semantic Communications with World Models

📅 2025-10-26

🤖 AI Summary

To address severe semantic distortion and reconstruction degradation in semantic video communication under low-bandwidth and dynamic channel conditions, this paper proposes a World Foundation Model (WFM)-based semantic video transmission framework. The framework uses the WFM to generate future frames from the current frame and textual guidance, and adds a lightweight depth-based feedback mechanism so that a frame is transmitted only when the prediction is no longer reliable. It further introduces segmentation-assisted partial frame restoration and camera-trajectory-driven proactive scheduling, marking the first integration of prediction reliability with real-time channel state optimization. Simulation results show that the proposed method reduces transmission overhead by an average of 62% across diverse channel conditions while preserving task-level semantic fidelity, mitigating error accumulation, and improving both the robustness and the efficiency of semantic communication.

📝 Abstract

Semantic communication is a promising technique for emerging wireless applications, which reduces transmission overhead by transmitting only task-relevant features instead of raw data. However, existing methods struggle under extremely low bandwidth and varying channel conditions, where corrupted or missing semantics lead to severe reconstruction errors. To resolve this difficulty, we propose a world foundation model (WFM)-aided semantic video transmission framework that leverages the predictive capability of WFMs to generate future frames based on the current frame and textual guidance. This design allows transmissions to be omitted when predictions remain reliable, thereby saving bandwidth. The WFM's predictions preserve the key semantics, yet minor prediction errors tend to amplify over time. To mitigate this issue, a lightweight depth-based feedback module is introduced to determine whether transmission of the current frame is needed. As an alternative to transmitting the entire frame, a segmentation-assisted partial transmission method is proposed to repair degraded frames, which further balances performance against bandwidth cost. Furthermore, an active transmission strategy is developed for mobile scenarios by exploiting camera trajectory information and proactively scheduling transmissions before channel quality deteriorates. Simulation results show that the proposed framework significantly reduces transmission overhead while maintaining task performance across varying scenarios and channel conditions.
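The depth-based feedback decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the error metric, the threshold, or how drift is modeled, so everything here is an assumption made for illustration: the mean absolute depth difference, the hypothetical names `mean_abs_depth_error`, `needs_transmission`, and `transmission_plan`, and the simple rule that accumulated drift resets to zero once a frame is transmitted.

```python
from typing import List

# A depth map as rows of per-pixel depth values (illustrative representation).
Depth = List[List[float]]


def mean_abs_depth_error(predicted: Depth, reference: Depth) -> float:
    """Mean absolute difference between two equally sized depth maps."""
    diffs = [
        abs(p - r)
        for pred_row, ref_row in zip(predicted, reference)
        for p, r in zip(pred_row, ref_row)
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def needs_transmission(predicted: Depth, reference: Depth, threshold: float) -> bool:
    """True when the predicted frame's depth has drifted past the threshold.

    Assumption: a single scalar threshold on the mean depth error decides
    whether the current frame must be sent.
    """
    return mean_abs_depth_error(predicted, reference) > threshold


def transmission_plan(drift_per_frame: List[float], threshold: float) -> List[bool]:
    """Toy model of error accumulation across consecutive predicted frames.

    Assumption: each frame adds some drift, and sending a frame resets
    the accumulated drift to zero.
    """
    plan: List[bool] = []
    accumulated = 0.0
    for drift in drift_per_frame:
        accumulated += drift
        send = accumulated > threshold
        plan.append(send)
        if send:
            accumulated = 0.0  # a fresh frame replaces the drifting prediction
    return plan
```

In this toy model, a constant drift produces periodic transmissions, which mirrors the idea that frames are sent only when the prediction stops being reliable, not at every step.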

Problem

Research questions and friction points this paper is trying to address.

Semantic communication struggles under low bandwidth and varying channels

Prediction errors from world models amplify over time

Mobile scenarios require proactive transmission scheduling for degraded channels
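The proactive scheduling idea in the last point can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how channel quality is forecast from the camera trajectory, so the sketch takes a per-slot SNR forecast as given, and the function name `proactive_slot`, the `min_snr_db` threshold, and the `lead` parameter are all hypothetical.

```python
from typing import List, Optional


def proactive_slot(
    forecast_snr_db: List[float], min_snr_db: float, lead: int = 1
) -> Optional[int]:
    """Pick a slot to transmit in before the channel is predicted to degrade.

    forecast_snr_db: assumed per-slot SNR forecast along the camera trajectory.
    min_snr_db: assumed SNR below which a transmission is considered unreliable.
    lead: how many slots ahead of the predicted degradation to transmit.

    Returns None when no degradation is predicted, or when the channel is
    already degraded in slot 0 so no earlier slot exists.
    """
    for slot, snr in enumerate(forecast_snr_db):
        if snr < min_snr_db:
            if slot == 0:
                return None
            return max(slot - lead, 0)
    return None
```

For example, with a forecast of `[18, 15, 12, 6, 4, 11]` dB and a 8 dB threshold, the first degraded slot is index 3, so the sketch schedules the transmission at index 2.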

Innovation

Methods, ideas, or system contributions that make the work stand out.

World foundation model predicts future frames for bandwidth saving

Lightweight depth feedback determines necessary frame transmissions

Segmentation-assisted partial transmission repairs degraded frames
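Segmentation-assisted partial transmission can be illustrated with a small sketch. The source does not specify the segmentation model, the per-segment error measure, or the repair rule, so the code below is an assumption-laden toy: frames are grids of scalar values, the mask assigns an integer segment label to each pixel, and the hypothetical helpers `segment_errors`, `segments_to_send`, and `repair` select and patch only the segments whose mean absolute error exceeds a threshold.

```python
from typing import Dict, List, Set

Frame = List[List[float]]  # illustrative single-channel frame
Mask = List[List[int]]     # segment label for each pixel


def segment_errors(predicted: Frame, reference: Frame, mask: Mask) -> Dict[int, float]:
    """Mean absolute error of the predicted frame within each segment."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for pred_row, ref_row, mask_row in zip(predicted, reference, mask):
        for p, r, label in zip(pred_row, ref_row, mask_row):
            sums[label] = sums.get(label, 0.0) + abs(p - r)
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def segments_to_send(errors: Dict[int, float], threshold: float) -> Set[int]:
    """Labels of segments degraded enough to be worth transmitting."""
    return {label for label, err in errors.items() if err > threshold}


def repair(predicted: Frame, reference: Frame, mask: Mask, labels: Set[int]) -> Frame:
    """Overwrite only the selected segments with reference pixels.

    Pixels outside the selected segments keep their predicted values, so
    only part of the frame has to be sent.
    """
    return [
        [r if label in labels else p for p, r, label in zip(pred_row, ref_row, mask_row)]
        for pred_row, ref_row, mask_row in zip(predicted, reference, mask)
    ]
```

The bandwidth saving in this toy comes from sending only the pixels that belong to the selected segments; a real system would send encoded semantic features for those regions instead of raw pixels.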
