BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models

📅 2026-02-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited generalization of existing beam prediction methods, which often lack high-level semantic understanding of dynamic environments. To overcome this, we integrate vision-language models (VLMs) into beam prediction for high-mobility unmanned aerial vehicles (UAVs) communicating with ground base stations. We formulate beam prediction as a visual question answering task, using an instruction prompt and an end-to-end projection of raw visual patches into the language domain so that the model captures fine-grained spatial semantics and performs high-level reasoning jointly. On real-world datasets, the approach outperforms state-of-the-art methods in prediction accuracy and generalizes better to other scenarios, such as vehicle-to-infrastructure (V2I) beam prediction.

📝 Abstract

For the low-altitude economy (LAE), fast and accurate beam prediction between high-mobility unmanned aerial vehicles (UAVs) and ground base stations is essential to ensure seamless coverage and reliable communications. However, existing deep learning-based beam prediction methods lack high-level semantic understanding of dynamic environments, resulting in poor generalization. Emerging large language model (LLM) based approaches show promise in enhancing generalization, but they typically lack rich environmental perception and therefore fail to capture the fine-grained spatial semantics essential for precise beam alignment. To tackle these limitations, we propose in this correspondence a novel end-to-end generative framework for beam prediction, called BeamVLM, which treats beam prediction as a visual question answering task and capitalizes on powerful existing vision-language models (VLMs). By projecting raw visual patches directly into the language domain and carefully designing an instructional prompt, BeamVLM enables the VLM to reason jointly over UAV trajectories and environmental context. Finally, experimental results on real-world datasets demonstrate that BeamVLM outperforms state-of-the-art methods in prediction accuracy and also exhibits superior generalization to other scenarios, such as vehicle-to-infrastructure (V2I) beam prediction.
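
To make the question-answering formulation concrete, the following is a minimal sketch of how a beam prediction query could be phrased as an instruction prompt and how a generated answer could be mapped back to a beam index. It is not the authors' implementation: the prompt wording, the function names `build_prompt` and `parse_beam`, and the codebook size of 64 are all assumptions made for illustration, and the VLM call itself is left out because the abstract does not name the model or its interface.

```python
import re

NUM_BEAMS = 64  # assumed codebook size; the abstract does not state one


def build_prompt(past_beams, num_beams=NUM_BEAMS):
    """Format a hypothetical instruction prompt from past beam indices.

    The camera frames would be passed to the VLM separately as images;
    this function only produces the text part of the query.
    """
    history = ", ".join(str(b) for b in past_beams)
    return (
        f"A base station serves a UAV using a codebook of {num_beams} beams "
        f"(indices 0 to {num_beams - 1}). "
        "The attached camera frames show the UAV over the most recent time steps. "
        f"The beam indices used at those steps were: {history}. "
        "Question: which beam index should be used at the next time step? "
        "Answer with a single integer."
    )


def parse_beam(answer, num_beams=NUM_BEAMS):
    """Return the first in-range integer found in the generated text, or None."""
    for token in re.findall(r"\d+", answer):
        idx = int(token)
        if 0 <= idx < num_beams:
            return idx
    return None


if __name__ == "__main__":
    print(build_prompt([12, 13, 15]))
    print(parse_beam("The next beam index is 17."))
```

Because the model generates free text rather than a class score, some validation step like `parse_beam` is needed in any generative formulation; returning `None` for out-of-range answers lets the caller fall back to, for example, the last known beam.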

Problem

Research questions and friction points this paper is trying to address.

beam prediction

low-altitude economy

unmanned aerial vehicles

vision-language models

generalization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language Model

Generative Beam Prediction

Low-altitude Economy