🤖 AI Summary
To address the low semantic transmission efficiency and the trade-off between channel resource utilization and reconstruction quality in generative-AI-driven wireless semantic communication, this paper proposes a Mixture-of-Semantics (MoS) transmission strategy. MoS leverages semantic segmentation to precisely distinguish Regions of Interest (ROI) from Regions of Non-Interest (RONI), enabling differentiated semantic encoding and transmission: ROI encoding prioritizes structural fidelity, while RONI encoding emphasizes semantic consistency. At the receiver, a diffusion model jointly reconstructs the full image. The framework integrates semantic segmentation, rate-distortion-optimized coding, and generative reconstruction to realize end-to-end semantic-level communication. Experiments demonstrate that MoS achieves a 3.2 dB gain in ROI peak signal-to-noise ratio (PSNR) and an 18.7% improvement in RONI CLIP similarity over baseline methods, marking the first approach in semantic communication to simultaneously optimize visual fidelity and semantic relevance.
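As a rough illustration of the ROI PSNR metric mentioned above, the following sketch computes PSNR only over pixels selected by a binary mask. It is not the paper's evaluation code: the function name `roi_psnr`, the boolean `(H, W)` mask convention, and the 8-bit peak value of 255 are all assumptions made for this example.

```python
import numpy as np

def roi_psnr(original, reconstructed, roi_mask, max_val=255.0):
    """PSNR restricted to pixels where roi_mask is True (illustrative helper)."""
    if not roi_mask.any():
        raise ValueError("roi_mask selects no pixels")
    orig = original.astype(np.float64)
    recon = reconstructed.astype(np.float64)
    # An (H, W) boolean mask indexes the first two axes, keeping all channels.
    diff = orig[roi_mask] - recon[roi_mask]
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical ROI content
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Errors outside the mask do not affect the result, which is what makes the metric specific to the ROI.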
📝 Abstract
In this paper, we propose a mixture of semantics (MoS) transmission strategy for wireless semantic communication systems based on generative artificial intelligence (AI). At the transmitter, we divide an image into regions of interest (ROI) and regions of non-interest (RONI) and extract the semantic information of each separately. The semantic information of the ROI can be allocated more bandwidth, while that of the RONI can be represented in a compact form for transmission. At the receiver, a diffusion model reconstructs the full image from the received semantic information of the ROI and RONI. Compared to existing generative AI-based methods, MoS enables more efficient use of channel resources by balancing visual fidelity and semantic relevance. Experimental results demonstrate that appropriate ROI-RONI allocation is critical. MoS achieves notable performance gains in the peak signal-to-noise ratio (PSNR) of the ROI and the CLIP score of the RONI.
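The transmitter-side split described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the segmentation mask is taken as given, `split_roi_roni` and `allocate_symbols` are hypothetical helpers, and the fixed `roi_share` rule merely stands in for whatever allocation policy the paper uses. The semantic encoders and the diffusion-based receiver are not modeled.

```python
import numpy as np

def split_roi_roni(image, roi_mask):
    """Split an (H, W, C) image into ROI and RONI parts using an (H, W) mask.

    Pixels outside each region are zeroed (an assumption for illustration).
    """
    mask3 = roi_mask[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over channels
    roi = np.where(mask3, image, 0)
    roni = np.where(mask3, 0, image)
    return roi, roni

def allocate_symbols(total_symbols, roi_share):
    """Divide a channel-symbol budget between ROI and RONI.

    roi_share is a hypothetical tuning knob in [0, 1], not a parameter
    taken from the paper.
    """
    if not 0.0 <= roi_share <= 1.0:
        raise ValueError("roi_share must lie in [0, 1]")
    roi_symbols = int(round(total_symbols * roi_share))
    return roi_symbols, total_symbols - roi_symbols
```

For example, `allocate_symbols(1000, 0.8)` gives the ROI 800 symbols and leaves 200 for the compact RONI representation. Sweeping `roi_share` is one simple way to explore the ROI-RONI allocation trade-off that the abstract identifies as critical.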