Channel-adaptive Cross-modal Generative Semantic Communication for Point Cloud Transmission

📅 2025-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Point cloud transmission for autonomous driving and extended reality faces noisy or incomplete sources, limited bandwidth, low signal-to-noise ratio (SNR), and a reliance on error-free side information. To address these challenges, this paper proposes GenSeC-PC, a channel-adaptive cross-modal generative semantic communication framework. GenSeC-PC introduces a semantic encoder that fuses the source point cloud with multi-view images, which serve as side information and are never transmitted. The decoder builds on the PointDif backbone and uses rectified denoising diffusion implicit models (RDDIM) to bring decoding time down to the millisecond level, enabling real-time reconstruction. An asymmetric joint semantic-channel coding scheme, in which only the encoder receives feedback on the average SNR and available bandwidth, supports fully analog transmission and removes the dependence on error-free side information. Generative priors and dual-metric guided fine-tuning improve robustness under low SNR, limited bandwidth, noisy or incomplete point clouds, and previously unseen objects. Simulation results indicate high compression efficiency and better reconstruction performance than PointDif.

📝 Abstract

With the rapid development of autonomous driving and extended reality, efficient transmission of point clouds (PCs) has become increasingly important. In this context, we propose a novel channel-adaptive cross-modal generative semantic communication (SemCom) framework for PC transmission, called GenSeC-PC. GenSeC-PC employs a semantic encoder that fuses images and point clouds, where images serve as non-transmitted side information. Meanwhile, the decoder is built upon the backbone of PointDif. Such a cross-modal design not only ensures high compression efficiency but also delivers superior reconstruction performance compared to PointDif. Moreover, to ensure robust transmission and reduce system complexity, we design a streamlined and asymmetric channel-adaptive joint semantic-channel coding architecture, where only the encoder requires feedback on the average signal-to-noise ratio (SNR) and available bandwidth. In addition, rectified denoising diffusion implicit models are employed to accelerate the decoding process to the millisecond level, enabling real-time PC communication. Unlike existing methods, GenSeC-PC leverages generative priors to ensure reliable reconstruction even from noisy or incomplete source PCs. More importantly, it supports fully analog transmission, improving compression efficiency by eliminating the need for the error-free side information transmission common in prior SemCom approaches. Simulation results confirm the effectiveness of cross-modal semantic extraction and dual-metric guided fine-tuning, highlighting the framework's robustness across diverse conditions, including low SNR, bandwidth limitations, varying numbers of 2D images, and previously unseen objects.
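The channel side of the abstract (analog transmission, with the encoder adapting to the reported average SNR and available bandwidth) can be illustrated with a toy sketch. This is not the paper's learned encoder: the function names, the truncation-based bandwidth adaptation, and the real-valued AWGN channel model are all assumptions made for illustration only.

```python
import math
import random

def select_bandwidth(symbols, bandwidth_ratio):
    """Keep the leading fraction of symbols that fits the available bandwidth.
    Truncation is an assumed stand-in for a learned rate-adaptation module."""
    keep = int(round(len(symbols) * bandwidth_ratio))
    keep = max(1, min(len(symbols), keep))
    return symbols[:keep]

def power_normalize(symbols):
    """Scale the symbols so their average power is 1."""
    power = sum(s * s for s in symbols) / len(symbols)
    if power == 0.0:
        raise ValueError("cannot normalize an all-zero symbol vector")
    scale = 1.0 / math.sqrt(power)
    return [s * scale for s in symbols]

def awgn_channel(symbols, snr_db, rng):
    """Add real Gaussian noise; assumes unit average signal power."""
    noise_std = math.sqrt(10 ** (-snr_db / 10))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]

def transmit(latent, snr_db, bandwidth_ratio, seed=0):
    """Hypothetical end-to-end analog link: adapt, normalize, add noise."""
    rng = random.Random(seed)
    tx = power_normalize(select_bandwidth(latent, bandwidth_ratio))
    rx = awgn_channel(tx, snr_db, rng)
    return tx, rx

# Usage: a stand-in latent vector sent at 10 dB SNR with half the bandwidth.
source_rng = random.Random(1)
latent = [source_rng.gauss(0.0, 2.0) for _ in range(20000)]
tx, rx = transmit(latent, snr_db=10.0, bandwidth_ratio=0.5)
```

In this sketch only the transmitter uses `snr_db` and `bandwidth_ratio` to shape the signal, which mirrors the asymmetric design described above; the receiver would decode `rx` without knowing either value.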

Problem

Research questions and friction points this paper is trying to address.

Efficient transmission of point clouds for autonomous driving and extended reality applications

Robust reconstruction from noisy or incomplete point cloud sources

Real-time communication with high compression and low system complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-modal fusion of images and point clouds

Channel-adaptive joint semantic-channel coding

Rectified denoising diffusion for fast decoding
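To give a feel for why implicit diffusion sampling can decode in few steps, here is a minimal sketch of the standard deterministic DDIM update (zero added noise). It does not implement the rectified variant the paper uses, whose details are not given in this summary. The names `ddim_step`, `ddim_sample`, and `eps_model`, and the toy oracle noise predictor, are assumptions for illustration.

```python
import math

def ddim_step(x_t, eps_hat, ab_t, ab_prev):
    """One deterministic DDIM update from noise level ab_t to ab_prev.
    ab_* are cumulative alpha products (alpha_bar), each in (0, 1]."""
    # Estimate the clean sample implied by the predicted noise.
    x0_hat = [(x - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
              for x, e in zip(x_t, eps_hat)]
    # Re-noise the estimate to the next (less noisy) level.
    return [math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * e
            for x0, e in zip(x0_hat, eps_hat)]

def ddim_sample(x_T, eps_model, alpha_bars):
    """Run DDIM over a short schedule, ordered from noisiest to cleanest.
    The final step targets alpha_bar = 1.0, i.e. a noise-free sample."""
    schedule = list(alpha_bars) + [1.0]
    x = x_T
    for ab_t, ab_prev in zip(schedule[:-1], schedule[1:]):
        x = ddim_step(x, eps_model(x, ab_t), ab_t, ab_prev)
    return x

# Usage with a toy oracle that knows the clean sample (a real decoder
# would use a trained noise-prediction network instead).
x0_true = [0.5, -1.0, 2.0]
noise = [0.3, -0.7, 1.1]
alpha_bars = [0.1, 0.5, 0.9]  # three sampling steps in total

def oracle_eps(x_t, ab_t):
    return [(x - math.sqrt(ab_t) * x0) / math.sqrt(1.0 - ab_t)
            for x, x0 in zip(x_t, x0_true)]

x_T = [math.sqrt(alpha_bars[0]) * x0 + math.sqrt(1.0 - alpha_bars[0]) * e
       for x0, e in zip(x0_true, noise)]
x0_rec = ddim_sample(x_T, oracle_eps, alpha_bars)
```

Because the update is deterministic, the number of entries in `alpha_bars` directly sets the number of network evaluations, which is the lever that fast diffusion decoders pull to cut latency.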
