🤖 AI Summary
To address challenges in point cloud transmission for autonomous driving and extended reality, including noisy or incomplete source data, bandwidth constraints, low signal-to-noise ratio (SNR), and reliance on error-free side information, this paper proposes GenSeC-PC, a channel-adaptive cross-modal generative semantic communication framework. Its semantic encoder fuses the source point cloud with multi-view images that serve as non-transmitted side information, and its decoder builds on the PointDif backbone, accelerated with a rectified denoising diffusion implicit model (RDDIM) to reach millisecond-level, real-time reconstruction. An asymmetric joint semantic-channel coding scheme requires average-SNR and available-bandwidth feedback at the encoder only, and fully analog transmission removes the need for error-free side information. Generative priors keep reconstruction reliable from noisy or incomplete point clouds, and dual-metric guided fine-tuning contributes to robustness under low SNR, limited bandwidth, varying numbers of 2D images, and previously unseen objects. Simulation results support the compression efficiency of the cross-modal design and its reconstruction quality relative to PointDif.
📝 Abstract
With the rapid development of autonomous driving and extended reality, efficient transmission of point clouds (PCs) has become increasingly important. In this context, we propose a novel channel-adaptive cross-modal generative semantic communication (SemCom) framework for PC transmission, called GenSeC-PC. GenSeC-PC employs a semantic encoder that fuses images and PCs, where the images serve as non-transmitted side information, while the decoder is built upon the PointDif backbone. This cross-modal design not only ensures high compression efficiency but also delivers superior reconstruction performance compared to PointDif. Moreover, to ensure robust transmission and reduce system complexity, we design a streamlined, asymmetric, channel-adaptive joint semantic-channel coding architecture in which only the encoder needs feedback on the average signal-to-noise ratio (SNR) and the available bandwidth. In addition, a rectified denoising diffusion implicit model is employed to accelerate decoding to the millisecond level, enabling real-time PC communication. Unlike existing methods, GenSeC-PC leverages generative priors to ensure reliable reconstruction even from noisy or incomplete source PCs. More importantly, it supports fully analog transmission, which improves compression efficiency by eliminating the error-free side information transmission that prior SemCom approaches commonly require. Simulation results confirm the effectiveness of cross-modal semantic extraction and dual-metric guided fine-tuning, highlighting the framework's robustness across diverse conditions, including low SNR, bandwidth limitations, varying numbers of 2D images, and previously unseen objects.
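To make the channel-adaptive, fully analog transmission idea concrete, the following is a minimal sketch of how a feature vector might be sent over an additive white Gaussian noise (AWGN) channel at a given SNR and bandwidth budget. It is not the GenSeC-PC implementation: the AWGN model, the real-valued symbols, the truncation-based bandwidth selection, and the function names `normalize_power`, `awgn_channel`, and `transmit` are all assumptions made for illustration, and the learned encoder and diffusion-based decoder are omitted.

```python
import math
import random


def normalize_power(symbols):
    """Scale real-valued symbols to unit average power (assumed power constraint)."""
    power = sum(s * s for s in symbols) / len(symbols)
    if power == 0.0:
        raise ValueError("cannot normalize an all-zero symbol vector")
    scale = 1.0 / math.sqrt(power)
    return [s * scale for s in symbols]


def awgn_channel(symbols, snr_db, rng):
    """Add Gaussian noise whose variance matches snr_db for unit-power input."""
    noise_std = math.sqrt(10 ** (-snr_db / 10))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def transmit(features, snr_db, bandwidth, rng):
    """Hypothetical analog link: keep `bandwidth` symbols, normalize, add noise.

    In a learned system the encoder would decide what to send from the SNR
    and bandwidth feedback; plain truncation here is only a stand-in.
    """
    kept = features[:bandwidth]
    return awgn_channel(normalize_power(kept), snr_db, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    features = [rng.uniform(-1.0, 1.0) for _ in range(4096)]
    received = transmit(features, snr_db=10.0, bandwidth=2048, rng=rng)
    print(len(received))
```

Because the symbols are sent directly as analog values, nothing in this sketch has to be delivered error-free; the receiver only sees the noisy vector, which is the setting in which a generative decoder would operate.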