🤖 AI Summary
This work proposes Joint Source-Channel-Generation Coding (JSCGC), a framework that integrates generative models into joint source-channel coding to overcome a limitation of traditional distortion-based communication systems: generic distortion metrics fail to capture human visual perception and often yield blurry or unrealistic reconstructions. By replacing deterministic decoding with probabilistic generation, JSCGC maximizes mutual information under channel constraints and controls stochastic sampling so that reconstructions are high-fidelity and semantically consistent. Theoretical analysis derives a lower bound on the maximum semantic inconsistency for a given amount of transmitted mutual information, revealing fundamental limits of controlling generation through communication. Experiments on image transmission show that JSCGC outperforms conventional distortion-oriented JSCC methods in both perceptual quality and semantic fidelity.
📝 Abstract
Conventional communication systems, including both separation-based coding and AI-driven joint source-channel coding (JSCC), are largely guided by Shannon's rate-distortion theory. However, generic distortion metrics fail to capture complex human visual perception, often resulting in blurred or unrealistic reconstructions. In this paper, we propose Joint Source-Channel-Generation Coding (JSCGC), a novel paradigm that shifts the focus from deterministic reconstruction to probabilistic generation. JSCGC uses a generative model at the receiver as a generator, rather than a conventional decoder, to parameterize the data distribution. This enables direct maximization of mutual information under channel constraints while controlling stochastic sampling so that outputs lie on the authentic data manifold with high fidelity. We further derive a theoretical lower bound on the maximum semantic inconsistency for a given amount of transmitted mutual information, elucidating the fundamental limits of communication in controlling the generative process. Extensive experiments on image transmission demonstrate that JSCGC substantially improves perceptual quality and semantic fidelity, significantly outperforming conventional distortion-oriented JSCC methods.
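
The abstract does not specify which channel model or constraint is used. As an illustration only, the sketch below assumes the setup common in the JSCC literature: an average power constraint on the transmitted symbols followed by an additive white Gaussian noise (AWGN) channel. The function names `power_normalize` and `awgn_channel` and the random vector standing in for the encoder output are hypothetical and are not taken from the paper.

```python
import numpy as np

def power_normalize(z, power=1.0):
    """Scale z so that its average per-symbol power equals `power`."""
    z = np.asarray(z, dtype=np.float64)
    scale = np.sqrt(power * z.size / np.sum(z ** 2))
    return z * scale

def awgn_channel(x, snr_db, rng):
    """Add white Gaussian noise at the given SNR (dB), relative to the measured signal power."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)

rng = np.random.default_rng(0)
z = rng.normal(size=256)        # assumed stand-in for an encoder's output symbols
x = power_normalize(z)          # enforce the (assumed) average power constraint
y = awgn_channel(x, snr_db=10.0, rng=rng)  # noisy observation seen by the receiver
```

In a system of the kind the abstract describes, `y` would be the signal that conditions the receiver-side generator; how that conditioning is done is not covered by this sketch.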