Generative AI-Enhanced Multi-Modal Semantic Communication in Internet of Vehicles: System Design and Methodologies

📅 2024-09-24
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the inefficient information transmission caused by heavy multi-modal perception data loads and dynamic, unstable channels in vehicle-to-everything (V2X) systems, this paper proposes Generative AI-Enhanced Multi-modal Semantic Communication (G-MSC). G-MSC is the first framework to deeply integrate diffusion models and multi-modal large language models into semantic communication, establishing a hybrid analog-digital transmission mechanism and a task-adaptive semantic encoding-decoding architecture. It jointly optimizes cross-modal semantic alignment, noise-robust decoding, and channel-state-driven transmission mode switching. Experimental results on predictive V2X tasks demonstrate that G-MSC reduces communication overhead by 62%, improves semantic accuracy by 31%, and achieves a packet-loss resilience rate of 98.7%, significantly overcoming the generalization bottleneck of conventional semantic communication in dynamic V2X environments.

📝 Abstract
Vehicle-to-everything (V2X) communication supports numerous tasks, from driving safety to entertainment services. To achieve a holistic view, vehicles are typically equipped with multiple sensors to compensate for undetectable blind spots. However, processing large volumes of multi-modal data increases transmission load, while the dynamic nature of vehicular networks adds to transmission instability. To address these challenges, we propose a novel framework, generative artificial intelligence (GAI)-enhanced multi-modal semantic communication (SemCom), referred to as G-MSC, designed to handle various vehicular network tasks by employing suitable analog or digital transmission. GAI presents a promising opportunity to transform the SemCom framework by significantly enhancing semantic encoding to facilitate the optimized integration of multi-modal information, enhancing channel robustness, and fortifying semantic decoding against noise interference. To validate the effectiveness of the G-MSC framework, we conduct a case study showcasing its performance in vehicular communication networks for predictive tasks. The experimental results show that the design achieves reliable and efficient communication in V2X networks. Finally, we present future research directions on G-MSC.
Problem

Research questions and friction points this paper is trying to address.

Vehicular Networks
Information Exchange
Data Management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Artificial Intelligence
Multi-modal Information Communication
V2X Network Optimization
Jiayi Lu
Beihang University
Autonomous Vehicle, Computer Vision, SOTIF, ADAS
Wanting Yang
Research Scientist, Singapore University of Technology and Design
Semantic communication, Martingale theory, Edge computing and intelligence
Zehui Xiong
Professor, Queen's University Belfast
Edge Intelligence, Internet of Things, Wireless Networking, Blockchain, Metaverse
Chengwen Xing
School of Information and Electronics, Beijing Institute of Technology, Beijing, China
Rahim Tafazolli
Institute for Communication Systems (ICS), 5/6GIC, The University of Surrey, UK
Tony Q. S. Quek
Pillar of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore
M. Debbah
KU 6G Research Center, Khalifa University of Science and Technology, P O Box 127788, Abu Dhabi, UAE; CentraleSupelec, University Paris-Saclay, 91192 Gif-sur-Yvette, France