🤖 AI Summary
This work addresses the challenge of semantic communication in bandwidth-constrained low-altitude UAV downlink scenarios, where deep fading events in dynamic channels often cause complete reconstruction failure in conventional approaches. To mitigate this, the authors propose a structure–texture semantic disentanglement mechanism that decomposes content into deterministic structural components and stochastic textural details. Leveraging channel state prediction, they design a hierarchical transmission strategy that prioritizes the delivery of structural information during high-reliability time slots, thereby enabling differentiated semantic protection. The system integrates multi-stream variational encoding–decoding, a channel prediction model, and generative reconstruction techniques. Experimental results demonstrate that the proposed framework preserves structural integrity even under significant channel prediction mismatch, achieving a 5.6 dB gain in peak signal-to-noise ratio over a single-stream baseline.
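For a sense of scale, the reported gain can be read through the standard PSNR definition. The symbols below (MAX for the peak pixel value, MSE for mean squared error, and the "base"/"prop" subscripts) are notation introduced here, not taken from the paper. The conversion assumes both systems are scored on the same images with the same peak value; a gain averaged over many images maps to an MSE ratio only approximately.

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right)
\quad\Longrightarrow\quad
\Delta\mathrm{PSNR}
  = 10 \log_{10}\!\left(\frac{\mathrm{MSE}_{\text{base}}}{\mathrm{MSE}_{\text{prop}}}\right)
  = 5.6\ \text{dB}
\quad\Longrightarrow\quad
\frac{\mathrm{MSE}_{\text{base}}}{\mathrm{MSE}_{\text{prop}}} = 10^{0.56} \approx 3.63
```

Under those assumptions, a 5.6 dB gain corresponds to roughly a 3.6-fold reduction in mean squared reconstruction error.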
📝 Abstract
Unmanned aerial vehicle (UAV) downlink transmission facilitates critical time-sensitive visual applications but is fundamentally constrained by bandwidth scarcity and dynamic channel impairments. The rapid fluctuation of the air-to-ground (A2G) link creates a regime where reliable transmission slots are intermittent and future channel quality can only be predicted with uncertainty. Conventional deep joint source-channel coding (DeepJSCC) methods transmit coupled feature streams, causing global reconstruction failure when specific time slots experience deep fading. Decoupling semantic content into a deterministic structure component and a stochastic texture component enables differentiated error protection strategies aligned with channel reliability. A predictive transmission framework is developed that utilizes a split-stream variational codec and a channel-aware scheduler to prioritize the delivery of structural layout over reliable slots. Experimental evaluations indicate that this approach achieves a 5.6 dB gain in peak signal-to-noise ratio (PSNR) over single-stream baselines and maintains structural fidelity under significant prediction mismatch.
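The scheduling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' scheduler: the function name `schedule_streams`, the use of predicted SNR in dB as the reliability measure, and the fixed structure-slot budget `n_structure` are all assumptions made for illustration. The sketch simply gives structure packets the slots with the highest predicted quality and leaves the remaining slots to texture.

```python
from typing import Dict, List


def schedule_streams(predicted_snr_db: List[float], n_structure: int) -> Dict[str, List[int]]:
    """Assign slot indices to the structure and texture streams.

    Hypothetical helper: the n_structure slots with the highest predicted
    SNR carry structure packets; all other slots carry texture packets.
    """
    n_slots = len(predicted_snr_db)
    if not 0 <= n_structure <= n_slots:
        raise ValueError("n_structure must be between 0 and the number of slots")

    # Rank slot indices by predicted quality, best first.
    # sorted() is stable, so equal predictions keep their time order.
    ranked = sorted(range(n_slots), key=lambda i: -predicted_snr_db[i])

    # Return each stream's slots in transmission (time) order.
    return {
        "structure": sorted(ranked[:n_structure]),
        "texture": sorted(ranked[n_structure:]),
    }


# Example: five upcoming slots with assumed predicted SNR values in dB.
plan = schedule_streams([3.0, 12.5, -4.0, 9.1, 0.5], n_structure=2)
print(plan)  # {'structure': [1, 3], 'texture': [0, 2, 4]}
```

A greedy ranking like this trusts the predictions completely. Because the abstract stresses that channel quality is only predictable with uncertainty, a practical scheduler would presumably also account for prediction confidence, which this sketch omits.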