M2BeamLLM: Multimodal Sensing-empowered mmWave Beam Prediction with Large Language Models

📅 2025-06-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Beam prediction for millimeter-wave (mmWave) massive MIMO systems in vehicle-to-infrastructure (V2I) communication remains challenging because channels vary rapidly and training data is limited. Method: The paper proposes M2BeamLLM, a beam prediction framework that fuses multimodal sensing data: images, radar, LiDAR, and GPS. It uses a large language model (LLM), GPT-2, as the backbone, combined with sensing data encoders and a multimodal alignment and fusion stage, and is trained end to end via supervised fine-tuning (SFT). Contribution/Results: The authors report that M2BeamLLM outperforms conventional deep learning baselines in both standard and few-shot settings, with higher prediction accuracy and robustness, and that performance improves as more sensing modalities are added. The work positions the framework as an efficient, intelligent beam prediction solution for V2I mmWave communications.

📝 Abstract

This paper introduces a novel neural network framework called M2BeamLLM for beam prediction in millimeter-wave (mmWave) massive multi-input multi-output (mMIMO) communication systems. M2BeamLLM integrates multi-modal sensor data, including images, radar, LiDAR, and GPS, leveraging the powerful reasoning capabilities of large language models (LLMs) such as GPT-2 for beam prediction. By combining sensing data encoding, multimodal alignment and fusion, and supervised fine-tuning (SFT), M2BeamLLM achieves significantly higher beam prediction accuracy and robustness, demonstrably outperforming traditional deep learning (DL) models in both standard and few-shot scenarios. Furthermore, its prediction performance consistently improves with increased diversity in sensing modalities. Our study provides an efficient and intelligent beam prediction solution for vehicle-to-infrastructure (V2I) mmWave communication systems.
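The data flow named in the abstract (sensing data encoding, multimodal alignment and fusion, then prediction) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature widths, the shared embedding width, the codebook size, the random linear encoders, and the mean-pooling `backbone` that stands in for the fine-tuned GPT-2 are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): per-modality feature widths,
# shared embedding width, and number of candidate beams.
FEATURE_DIMS = {"image": 512, "radar": 256, "lidar": 256, "gps": 2}
EMBED_DIM = 64
NUM_BEAMS = 64

# One random linear encoder per modality, standing in for learned encoders.
encoders = {
    name: rng.normal(scale=dim ** -0.5, size=(dim, EMBED_DIM))
    for name, dim in FEATURE_DIMS.items()
}
# Linear head mapping the fused embedding to one logit per beam.
head = rng.normal(scale=EMBED_DIM ** -0.5, size=(EMBED_DIM, NUM_BEAMS))


def encode_and_align(features):
    """Project each modality into the shared space and L2-normalize it."""
    tokens = []
    for name, x in features.items():
        z = x @ encoders[name]
        z = z / (np.linalg.norm(z, axis=-1, keepdims=True) + 1e-12)
        tokens.append(z)
    return np.stack(tokens, axis=1)  # (batch, num_modalities, EMBED_DIM)


def backbone(tokens):
    """Placeholder for the fine-tuned LLM: mean-pool the modality tokens."""
    return tokens.mean(axis=1)


def predict_beams(features):
    """Return the highest-scoring beam index for each sample."""
    logits = backbone(encode_and_align(features)) @ head
    return logits.argmax(axis=-1)


batch = {name: rng.normal(size=(4, dim)) for name, dim in FEATURE_DIMS.items()}
beams = predict_beams(batch)  # one beam index per sample
```

Because each modality becomes one token in a shared space, `predict_beams` also accepts a subset of modalities, which is one simple way to study how accuracy changes as modalities are added or removed.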

Problem

Research questions and friction points this paper is trying to address.

Predicting mmWave beams using multimodal sensor data

Enhancing beam prediction accuracy with LLMs

Improving V2I communication robustness via diverse sensing
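To make the beam prediction target in the list above concrete, the sketch below shows how a ground-truth beam label is commonly defined: the codebook entry with the highest beamforming gain for a given channel. The uniform linear array, the DFT codebook, and the function names are assumptions for illustration; the summary does not specify the array geometry or codebook used in the paper.

```python
import numpy as np


def dft_codebook(num_antennas, num_beams):
    """Columns are unit-norm DFT beamforming vectors (assumed codebook)."""
    n = np.arange(num_antennas)[:, None]
    k = np.arange(num_beams)[None, :]
    return np.exp(2j * np.pi * n * k / num_beams) / np.sqrt(num_antennas)


def best_beam(channel, codebook):
    """Index of the beam that maximizes the gain |w^H h|^2."""
    gains = np.abs(codebook.conj().T @ channel) ** 2
    return int(np.argmax(gains))


codebook = dft_codebook(num_antennas=16, num_beams=16)
# Toy single-path channel aligned with codebook entry 5.
channel = codebook[:, 5] * np.sqrt(16)
label = best_beam(channel, codebook)
```

A sensing-aided predictor is trained to output this index from sensor data alone, so the exhaustive search over the codebook is not needed at inference time.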

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal sensor data integration for beam prediction

Leveraging the reasoning capabilities of LLMs such as GPT-2 for beam prediction

Applying supervised fine-tuning (SFT) to improve accuracy and robustness
