AI Summary
To jointly optimize computational efficiency, robustness, and reconstruction quality for image transmission over dynamic 6G channels, this paper proposes a semantic-aware deep joint source-channel coding (JSCC) framework. The method introduces a differentiable end-to-end architecture that integrates foreground semantic extraction with adaptive background synthesis: MediaPipe-based pose estimation localizes semantically critical regions, while Rembg and a pre-trained background library enable lightweight background reconstruction. The result is foreground-focused encoding and background-simplified synthesis under a unified optimization. Experiments show that the proposed approach significantly outperforms conventional JSCC methods in PSNR under low-SNR conditions, reduces transmission overhead by approximately 35%, and maintains high visual fidelity, making it particularly suitable for resource-constrained mobile multimedia applications.
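The foreground/background split described above can be illustrated with a minimal sketch. It is not the authors' implementation: the mask is written by hand (in the paper it would come from Rembg guided by MediaPipe pose estimation), the two-entry `BACKGROUND_LIBRARY`, the helper names `composite` and `payload_pixels`, and the idea of sending a library index are all assumptions made for illustration, and no learned encoder or channel is modeled.

```python
from typing import Dict, List

# Grayscale image as rows of 0..255 values (illustrative representation).
Image = List[List[int]]

# Hypothetical background library assumed to be shared by transmitter and receiver.
BACKGROUND_LIBRARY: Dict[int, Image] = {
    0: [[10, 10, 10], [10, 10, 10]],
    1: [[200, 200, 200], [200, 200, 200]],
}


def composite(foreground: Image, mask: Image, background: Image) -> Image:
    """Keep foreground pixels where mask is 1 and library background elsewhere."""
    return [
        [f if m else b for f, m, b in zip(f_row, m_row, b_row)]
        for f_row, m_row, b_row in zip(foreground, mask, background)
    ]


def payload_pixels(mask: Image) -> int:
    """Count the foreground pixels that would actually need to be transmitted."""
    return sum(sum(row) for row in mask)


# Toy 2x3 image whose middle column is the "subject".
image = [[12, 90, 11], [9, 95, 13]]
mask = [[0, 1, 0], [0, 1, 0]]  # hand-written stand-in for a segmentation mask

# Transmitter side: zero out everything outside the mask.
foreground = [
    [p if m else 0 for p, m in zip(p_row, m_row)]
    for p_row, m_row in zip(image, mask)
]

# Receiver side: rebuild the frame from the foreground and library entry 0.
reconstructed = composite(foreground, mask, BACKGROUND_LIBRARY[0])
```

In this toy case only the masked pixels and a background index would need to cross the channel; the savings reported in the paper depend on the actual encoder and are not reproduced here.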
Abstract
The advent of 6G networks demands unprecedented levels of intelligence, adaptability, and efficiency to address challenges such as ultra-high-speed data transmission, ultra-low latency, and massive connectivity in dynamic environments. Traditional wireless image transmission frameworks, which rely on static configurations and separate source and channel coding, struggle to balance computational efficiency, robustness, and quality under fluctuating channel conditions. To bridge this gap, this paper proposes an AI-native deep joint source-channel coding (JSCC) framework tailored for resource-constrained 6G networks. Our approach integrates key information extraction and adaptive background synthesis to enable intelligent, semantic-aware transmission. Using two AI-driven tools, MediaPipe for human pose detection and Rembg for background removal, the model dynamically isolates foreground features and matches backgrounds from a pre-trained library, reducing data payloads while preserving visual fidelity. Experimental results demonstrate significant improvements in peak signal-to-noise ratio (PSNR) compared with traditional JSCC methods, especially under low-SNR conditions. This approach offers a practical solution for multimedia services in resource-constrained mobile communications.
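For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how PSNR can be measured against channel SNR. It assumes a real-valued additive white Gaussian noise (AWGN) channel and sends raw pixel values without any coding, so it illustrates the metric only and not the proposed JSCC model; the function names `psnr` and `awgn`, the 8-bit peak value, and the random test signal are illustrative assumptions.

```python
import math
import random


def psnr(reference, reconstructed, max_value=255.0):
    """PSNR in dB between two equally sized flat pixel sequences."""
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


def awgn(symbols, snr_db, rng):
    """Add Gaussian noise scaled so signal power / noise power equals snr_db."""
    signal_power = sum(s * s for s in symbols) / len(symbols)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in symbols]


rng = random.Random(0)  # fixed seed so runs are repeatable
pixels = [float(rng.randrange(256)) for _ in range(10_000)]

# Sweep a few channel SNRs and report the resulting PSNR of the received pixels.
for snr_db in (0, 10, 20):
    noisy = awgn(pixels, snr_db, rng)
    print(f"SNR {snr_db:>2} dB -> PSNR {psnr(pixels, noisy):.2f} dB")
```

A learned JSCC system would place an encoder before `awgn` and a decoder after it, and the same `psnr` call would then compare the decoder output with the original image.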