Conditional Denoising Diffusion Autoencoders for Wireless Semantic Communications

📅 2025-09-26

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing semantic communication systems predominantly rely on autoencoder architectures in which the encoder is tightly coupled to a matched decoder; these designs leave the signal distribution largely unexplored and scale poorly in practice. This paper introduces diffusion autoencoders to wireless semantic communication for the first time, proposing a “semantic latent variable + conditional diffusion (CDiff)” framework: a neural semantic encoder at the transmitter extracts high-level, low-dimensional semantic representations, while the receiver reconstructs the signal through a score-based denoising process conditioned on the received semantic latent variable. This design decouples encoding from decoding, and the decoder is analytically shown to be a consistent estimator of the ground-truth data. Experiments on CIFAR-10 and MNIST show better reconstruction performance than conventional autoencoders and VAEs. The framework is further extended to multi-user scenarios, where channel distortion and semantic prior mismatch are identified as the dominant practical impairments.

📝 Abstract

Semantic communication (SemCom) systems aim to learn the mapping from low-dimensional semantics to the high-dimensional ground truth. While this is more akin to a "domain translation" problem, existing frameworks typically emphasize channel-adaptive neural encoding-decoding schemes without fully exploring the signal distribution. Moreover, such methods have so far employed autoencoder-based architectures, where the encoder is tightly coupled to a matched decoder, causing scalability issues in practice. To address these gaps, diffusion autoencoder models are proposed for wireless SemCom. The goal is to learn a "semantic-to-clean" mapping from the semantic space to the ground-truth probability distribution. A neural encoder at the semantic transmitter extracts the high-level semantics, and a conditional diffusion model (CDiff) at the semantic receiver exploits the source distribution for signal-space denoising, with the received semantic latents incorporated as the conditioning input to "steer" the decoding process towards the semantics intended by the transmitter. It is analytically proved that the proposed decoder model is a consistent estimator of the ground-truth data. Furthermore, extensive simulations over the CIFAR-10 and MNIST datasets are provided along with design insights, highlighting the performance compared to legacy autoencoders and variational autoencoders (VAE). Simulations are further extended to multi-user SemCom, identifying the dominating factors in a more realistic setup.
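
The transmit, channel, and conditional-denoise pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder, the AWGN channel, the noise schedule, and every function name (`make_schedule`, `awgn`, `toy_eps_model`, `conditional_sample`) are assumptions made for illustration. In particular, `toy_eps_model` is a hand-written stand-in for a trained conditional noise predictor, and the sampler is the standard DDPM ancestral sampling loop rather than whatever sampler the paper uses.

```python
import numpy as np

def make_schedule(T=50, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed, textbook choice)."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def awgn(z, snr_db, rng):
    """Pass the semantic latent through an additive white Gaussian noise channel."""
    noise_var = np.mean(z ** 2) / (10.0 ** (snr_db / 10.0))
    return z + rng.normal(0.0, np.sqrt(noise_var), size=z.shape)

def toy_eps_model(x_t, t, z_hat, W_dec, alpha_bars):
    """Hypothetical stand-in for a trained predictor eps_theta(x_t, t, z_hat).

    It guesses the clean signal from the received latent and returns the
    noise implied by that guess under the forward process.
    """
    x0_hat = W_dec @ z_hat
    return (x_t - np.sqrt(alpha_bars[t]) * x0_hat) / np.sqrt(1.0 - alpha_bars[t])

def conditional_sample(z_hat, dim, eps_model, schedule, rng):
    """DDPM ancestral sampling, conditioned on the received latent z_hat."""
    betas, alphas, alpha_bars = schedule
    x = rng.normal(size=dim)  # start from pure Gaussian noise
    for t in range(len(betas) - 1, -1, -1):
        eps = eps_model(x, t, z_hat)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Fresh noise is injected at every step except the final one
        x = mean + np.sqrt(betas[t]) * rng.normal(size=dim) if t > 0 else mean
    return x

rng = np.random.default_rng(0)
dim, latent_dim = 16, 4
# Toy linear "semantic encoder" with orthonormal rows, shape (latent_dim, dim)
W = np.linalg.qr(rng.normal(size=(dim, latent_dim)))[0].T
x = W.T @ rng.normal(size=latent_dim)   # toy source signal
z = W @ x                               # transmitter: extract semantics
z_hat = awgn(z, snr_db=20.0, rng=rng)   # wireless channel
schedule = make_schedule()
eps_model = lambda x_t, t, c: toy_eps_model(x_t, t, c, W.T, schedule[2])
x_rec = conditional_sample(z_hat, dim, eps_model, schedule, rng)  # receiver
print("reconstruction MSE:", np.mean((x_rec - x) ** 2))
```

Because the stand-in predictor always points at the linear decode of `z_hat`, the loop collapses onto that estimate at the last step; a trained network would instead draw on the learned source distribution. The sketch is meant only to show where the received latent enters the reverse process as the conditioning input.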

Problem

Research questions and friction points this paper is trying to address.

Addresses limitations in semantic communication signal distribution modeling

Proposes diffusion autoencoders to decouple encoder-decoder dependencies

Enables semantic-to-clean mapping for wireless multi-user communications

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion autoencoder models for semantic communication

Conditional diffusion model for signal-space denoising

Semantic latents steer decoding towards intended semantics
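
For orientation, the way a semantic latent can "steer" a diffusion decoder is often written in the textbook DDPM noise-prediction form below. This is a generic formulation given under assumed notation ($x_0$ for the clean signal, $\hat{z}$ for the received semantic latent, $\bar{\alpha}_t$ for the cumulative noise schedule, $\epsilon_\theta$ for the denoising network); the paper's own score-based objective and symbols may differ.

```latex
% Forward (noising) process applied to the clean signal x_0
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t)\, I\right)

% Conditional noise-prediction objective: the network also sees the received latent \hat{z}
\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, \epsilon \sim \mathcal{N}(0, I),\, t}
  \left[ \left\| \epsilon - \epsilon_\theta\!\left(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\ t,\ \hat{z}\right) \right\|^2 \right]
```

The only difference from an unconditional diffusion model is the extra argument $\hat{z}$, which is how the transmitted semantics influence every denoising step at the receiver.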

🔎 Similar Papers

Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints