Semantic-Aware Visual Information Transmission With Key Information Extraction Over Wireless Networks

📅 2025-06-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To jointly optimize computational efficiency, robustness, and reconstruction quality for image transmission over dynamic 6G channels, this paper proposes a semantic-aware deep joint source-channel coding (JSCC) framework. The method introduces a differentiable end-to-end architecture that combines foreground semantic extraction with adaptive background synthesis: MediaPipe-based pose estimation localizes semantically critical regions, while Rembg and a pre-trained background library enable lightweight background reconstruction. Foreground-focused encoding and simplified background synthesis are thereby handled within a single unified optimization. Experiments show that the proposed approach significantly outperforms conventional JSCC methods in PSNR under low-SNR conditions, reduces transmission overhead by approximately 35%, and maintains high visual fidelity, making it particularly suitable for resource-constrained mobile multimedia applications.

📝 Abstract

The advent of 6G networks demands unprecedented levels of intelligence, adaptability, and efficiency to address challenges such as ultra-high-speed data transmission, ultra-low latency, and massive connectivity in dynamic environments. Traditional wireless image transmission frameworks, which rely on static configurations and isolated source-channel coding, struggle to balance computational efficiency, robustness, and quality under fluctuating channel conditions. To bridge this gap, this paper proposes an AI-native deep joint source-channel coding (JSCC) framework tailored for resource-constrained 6G networks. Our approach integrates key information extraction and adaptive background synthesis to enable intelligent, semantic-aware transmission. Leveraging two AI-driven tools, MediaPipe for human pose detection and Rembg for background removal, the model dynamically isolates foreground features and matches backgrounds from a pre-trained library, reducing data payloads while preserving visual fidelity. Experimental results demonstrate significant improvements in peak signal-to-noise ratio (PSNR) compared with a traditional JSCC method, especially under low-SNR conditions. This approach offers a practical solution for multimedia services in resource-constrained mobile communications.
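To make the evaluation metric concrete, the following minimal sketch shows how PSNR could be measured after pixels pass through a noisy channel at a chosen SNR. It is not the paper's JSCC model: it sends raw pixel values over a real-valued additive white Gaussian noise channel, which is an assumption made purely for illustration. The function names `awgn` and `psnr` and the 8-bit peak value are likewise illustrative choices.

```python
import math
import random


def awgn(symbols, snr_db, rng):
    """Add real-valued white Gaussian noise so the result has the given SNR in dB."""
    signal_power = sum(s * s for s in symbols) / len(symbols)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in symbols]


def psnr(reference, received, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Illustrative run: a synthetic 64x64 "image" flattened to 4096 pixels.
rng = random.Random(0)
pixels = [rng.randrange(256) for _ in range(4096)]
for snr_db in (0, 10, 20):
    noisy = awgn(pixels, snr_db, rng)
    print(f"SNR {snr_db:>2} dB -> PSNR {psnr(pixels, noisy):.2f} dB")
```

In this uncoded setting PSNR falls steadily as channel SNR drops, which is the regime where the abstract reports the largest gains over a traditional JSCC baseline.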

Problem

Research questions and friction points this paper is trying to address.

Enhance visual data transmission efficiency in 6G networks

Balance computational efficiency and quality in dynamic channels

Reduce data payloads while preserving visual fidelity

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-native deep joint source-channel coding

Key information extraction and adaptive synthesis

Dynamic foreground isolation and background matching
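The foreground isolation and background matching idea can be sketched as follows, under several assumptions. The foreground mask is taken as given here, whereas the paper derives it with MediaPipe and Rembg. The background is matched by minimum squared error over non-foreground pixels, which is a stand-in because the summary does not say how matching is done. The names `encode`, `decode`, and `library` are hypothetical, and images are small grayscale grids.

```python
def encode(image, mask, library):
    """Return the foreground pixels and the index of the best-matching library background.

    image: 2D list of grayscale pixels
    mask: 2D list of 0/1 flags (1 marks a foreground pixel)
    library: list of 2D backgrounds, each the same shape as image
    """
    foreground = [(r, c, image[r][c])
                  for r, row in enumerate(mask)
                  for c, flag in enumerate(row) if flag]

    def background_error(candidate):
        # Squared error measured over background pixels only (assumed criterion).
        return sum((image[r][c] - candidate[r][c]) ** 2
                   for r, row in enumerate(mask)
                   for c, flag in enumerate(row) if not flag)

    index = min(range(len(library)), key=lambda i: background_error(library[i]))
    return foreground, index


def decode(foreground, index, library):
    """Paste the received foreground pixels onto the selected library background."""
    canvas = [row[:] for row in library[index]]
    for r, c, value in foreground:
        canvas[r][c] = value
    return canvas


# Illustrative 3x3 example: one bright foreground pixel on a flat background.
image = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
library = [[[0] * 3 for _ in range(3)], [[10] * 3 for _ in range(3)]]
foreground, index = encode(image, mask, library)
restored = decode(foreground, index, library)
```

Only the foreground pixels and a single background index need to cross the channel in this sketch, which is the source of the payload reduction. The receiver must hold the same background library as the transmitter.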
