M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models

📅 2025-06-17
🤖 AI Summary
Problem: Beam prediction for millimeter-wave (mmWave) MIMO systems in vehicle-to-infrastructure (V2I) cooperative driving remains challenging because channels vary rapidly and training data are limited. Method: The paper proposes a beam selection framework that fuses multimodal sensing data: RGB images, radar point clouds, LiDAR scans, and GPS trajectories. It is presented as the first to apply a large language model (LLM), GPT-2, to this task, combining the LLM with multimodal encoders and a cross-modal alignment stage. The model is trained end to end by supervised fine-tuning, generalizes in few-shot settings, and improves in accuracy as more sensing modalities are added. Contribution/Results: The method outperforms conventional deep learning baselines in prediction accuracy and robustness in both standard and few-shot settings. It offers a scalable, adaptive approach to beam management for V2I mmWave communications under real-world mobility.

📝 Abstract
This paper introduces a novel neural network framework called M2BeamLLM for beam prediction in millimeter-wave (mmWave) massive multi-input multi-output (mMIMO) communication systems. M2BeamLLM integrates multi-modal sensor data, including images, radar, LiDAR, and GPS, leveraging the powerful reasoning capabilities of large language models (LLMs) such as GPT-2 for beam prediction. By combining sensing data encoding, multimodal alignment and fusion, and supervised fine-tuning (SFT), M2BeamLLM achieves significantly higher beam prediction accuracy and robustness, demonstrably outperforming traditional deep learning (DL) models in both standard and few-shot scenarios. Furthermore, its prediction performance consistently improves with increased diversity in sensing modalities. Our study provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems.
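
The encode, align, fuse, and predict stages named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the feature sizes, the random linear encoders, the averaging fusion, and the linear prediction head are all hypothetical stand-ins, and the GPT-2 backbone and fine-tuning step are omitted entirely.

```python
import math
import random

def project(features, weights):
    """Linear projection: map a feature vector through a weight matrix."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def l2_normalize(vec):
    """Scale a vector to unit length so modalities share a common scale."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def fuse(embeddings):
    """Fuse aligned embeddings by element-wise averaging (a stand-in choice)."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def predict_beam(fused, head):
    """Score every beam index with a linear head and return the best one."""
    scores = project(fused, head)
    return max(range(len(scores)), key=scores.__getitem__)

def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
SHARED_DIM, NUM_BEAMS = 8, 64  # hypothetical sizes
raw_dims = {"image": 16, "radar": 12, "lidar": 10, "gps": 2}  # hypothetical

# One encoder per modality, all mapping into the same shared space.
encoders = {m: random_matrix(SHARED_DIM, d, rng) for m, d in raw_dims.items()}
head = random_matrix(NUM_BEAMS, SHARED_DIM, rng)

# One synthetic sample with a raw feature vector per modality.
sample = {m: [rng.uniform(-1.0, 1.0) for _ in range(d)] for m, d in raw_dims.items()}

aligned = [l2_normalize(project(sample[m], encoders[m])) for m in raw_dims]
fused = fuse(aligned)
beam_index = predict_beam(fused, head)
print(beam_index)
```

Dropping an entry from `raw_dims` shows how the same pipeline runs with fewer modalities, which is the axis along which the abstract reports accuracy improving.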
Problem

Research questions and friction points this paper is trying to address.

Predicting mmWave beams using multimodal sensor data
Enhancing beam prediction accuracy with LLMs
Improving V2I communication robustness via diverse sensing
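
For readers new to the task, beam prediction is commonly framed as choosing an index from a fixed beam codebook. The sketch below shows that framing under assumptions not taken from the paper: a half-wavelength uniform linear array, a DFT-style codebook, and a single line-of-sight path. The array size, codebook size, and angle are hypothetical.

```python
import cmath
import math

def steering_vector(num_antennas, angle_rad):
    """Array response of a half-wavelength-spaced uniform linear array."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * cmath.exp(1j * math.pi * n * math.sin(angle_rad))
            for n in range(num_antennas)]

def dft_codebook(num_antennas, num_beams):
    """Beams whose spatial frequencies are spread uniformly over [-1, 1)."""
    scale = 1.0 / math.sqrt(num_antennas)
    codebook = []
    for k in range(num_beams):
        u = -1.0 + 2.0 * k / num_beams  # u plays the role of sin(angle)
        codebook.append([scale * cmath.exp(1j * math.pi * n * u)
                         for n in range(num_antennas)])
    return codebook

def beam_gain(channel, beam):
    """Received power |w^H h|^2 for one candidate beam."""
    inner = sum(w.conjugate() * h for w, h in zip(beam, channel))
    return abs(inner) ** 2

def best_beam(channel, codebook):
    """Exhaustive search: index of the beam with the highest gain."""
    gains = [beam_gain(channel, beam) for beam in codebook]
    return max(range(len(gains)), key=gains.__getitem__)

NUM_ANTENNAS, NUM_BEAMS = 16, 64  # hypothetical sizes
codebook = dft_codebook(NUM_ANTENNAS, NUM_BEAMS)
channel = steering_vector(NUM_ANTENNAS, math.radians(30.0))  # single LoS path
optimal_index = best_beam(channel, codebook)
print(optimal_index)
```

In a supervised setup, an index found by exhaustive search like `optimal_index` would typically serve as the training label, and a learned predictor would try to recover it from sensing data without sweeping every beam.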
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal sensor data integration for beam prediction
Leveraging LLMs like GPT-2 for reasoning
Supervised fine-tuning enhances accuracy and robustness
Can Zheng
University of Pittsburgh
Data Mining, Natural Language Processing, Medical AI
Jiguang He
Associate Professor, Great Bay University & Adjunct Professor, University of Oulu
6G, ISAC, Positioning, RIS, mmWave
Chung G. Kang
Department of Electrical and Computer Engineering, Korea University, Seoul 02841, South Korea
Guofa Cai
School of Information Engineering, Guangdong University of Technology, Guangzhou, China
Zitong Yu
U.S. Food and Drug Administration
Medical imaging, Deep learning, Machine learning, Image reconstruction
Mérouane Debbah
Center for 6G Technology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates