🤖 AI Summary
To address the low accuracy, poor efficiency, and weak compatibility of image transmission over MIMO channels under adverse channel conditions, this paper proposes a semantic communication framework compatible with standard codecs. The framework preserves the conventional separation-based architecture and introduces two learnable modules: a preprocessing-empowered network (PPEN) and a precoder & combiner-enhanced network (PCEN), enabling joint optimization of deep neural networks with non-differentiable standard codecs (e.g., JPEG for source coding and LDPC for channel coding). By leveraging proxy-gradient approximation and MIMO-channel-adaptive semantic encoding, it avoids end-to-end retraining and supports plug-and-play deployment. Experiments demonstrate over 29% bandwidth savings compared to baseline methods, significantly reduced computational complexity, and strong generalization to unseen datasets and tasks.
📝 Abstract
Joint source-channel coding (JSCC) is a promising paradigm for next-generation communication systems, particularly in challenging transmission environments. In this paper, we propose a novel standard-compatible JSCC framework for the transmission of images over multiple-input multiple-output (MIMO) channels. Unlike existing end-to-end AI-based DeepJSCC schemes, our framework consists of learnable modules that enable communication using conventional separate source and channel codes (SSCC), making it amenable to deployment on legacy systems. Specifically, the learnable modules comprise a preprocessing-empowered network (PPEN) for preserving essential semantic information, and a precoder & combiner-enhanced network (PCEN) for efficient transmission over a resource-constrained MIMO channel. We treat the existing compression and channel coding modules as non-trainable blocks. Since these blocks are non-differentiable, we employ a proxy network that mimics their operations when training the learnable modules. Numerical results demonstrate that our scheme saves more than 29% of the channel bandwidth and requires lower complexity than the constrained baselines. We also show its generalization capability to unseen datasets and tasks through extensive experiments.
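The core trick of training learnable modules around a frozen, non-differentiable codec can be illustrated with a toy sketch. Below, a uniform quantizer stands in for the standard codec, and a single learnable gain stands in for the preprocessing network; the backward pass treats the codec as an identity proxy (a straight-through-style approximation). All names (`codec`, `gain`, the quantization step) are illustrative assumptions, not the paper's actual architecture or proxy network.

```python
import numpy as np

rng = np.random.default_rng(0)

def codec(x, step=0.25):
    """Non-differentiable stand-in for a standard codec (uniform quantizer)."""
    return np.round(x / step) * step

# Toy data: the target is what a "correct" preprocessing gain of 0.8 would
# produce after the codec; training should recover a gain close to 0.8.
x = rng.uniform(0.0, 1.0, size=16)
target = codec(0.8 * x)

gain = 0.2   # learnable preprocessing parameter (stand-in for PPEN)
lr = 0.5
for _ in range(200):
    recon = codec(gain * x)            # forward through the frozen codec
    err = recon - target
    # Proxy gradient: the true codec gradient is zero almost everywhere,
    # so we pretend codec'(.) == 1 (identity proxy) in the chain rule.
    grad = np.mean(2.0 * err * x)
    gain -= lr * grad

final_loss = np.mean((codec(gain * x) - target) ** 2)
```

In the paper's actual framework the identity proxy is replaced by a trained proxy network that mimics the compression and channel-coding pipeline, but the gradient flow pattern is the same: forward through the real non-differentiable blocks, backward through a differentiable surrogate.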