🤖 AI Summary
Current wireless foundation models struggle to unify diverse communication tasks across heterogeneous scenarios and input/output modalities. To address this, we propose MUSE-FM—the first environment-aware, multi-task foundation model for wireless communications. It introduces a prompt-guided unified encoder-decoder architecture to standardize arbitrary input/output formats and incorporates multimodal environmental context as prior knowledge to enable cross-scenario feature extraction and multi-task co-optimization. By jointly modeling environment-aware features and training across tasks—including channel estimation, signal detection, and resource scheduling—MUSE-FM consistently outperforms state-of-the-art methods. Experiments demonstrate that explicit environmental context significantly enhances cross-scenario generalization, while the prompt mechanism enables rapid zero-shot or few-shot adaptation to unseen tasks. MUSE-FM thus achieves both strong adaptability and scalable extensibility in dynamic wireless environments.
📝 Abstract
Recent advances in foundation models (FMs) have attracted increasing attention in the wireless communication domain. Leveraging their powerful multi-task learning capability, FMs hold the promise of unifying multiple wireless communication tasks within a single framework. Nevertheless, existing wireless FMs lack the uniformity needed to address multiple tasks with diverse inputs and outputs across different communication scenarios. In this paper, we propose a MUlti-taSk Environment-aware FM (MUSE-FM) with a unified architecture that handles multiple tasks in wireless communications while effectively incorporating scenario information. Specifically, to achieve task uniformity, we propose a unified prompt-guided data encoder-decoder pair to handle data with heterogeneous formats and distributions across different tasks. In addition, we integrate the environmental context as a multi-modal input, which serves as prior knowledge of the environment and channel distributions and facilitates cross-scenario feature extraction. Simulation results show that the proposed MUSE-FM outperforms existing methods on various tasks, and that its prompt-guided encoder-decoder pair improves scalability to new task configurations. Moreover, incorporating environment information improves the model's ability to adapt to different scenarios.
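To make the idea of a unified, prompt-guided input format concrete, the following is a minimal sketch and not the authors' implementation. It assumes that task data and environmental context can each be flattened into a vector and chunked into fixed-width tokens, and that a task prompt is prepended to the sequence. The names `TOKEN_DIM`, `TASKS`, `to_tokens`, `task_prompt`, and `build_sequence` are hypothetical, and the one-hot prompt stands in for whatever learned prompt embedding the model actually uses.

```python
from typing import List, Sequence

TOKEN_DIM = 8  # hypothetical token width, chosen only for illustration

# Illustrative task names; the real task set is defined by the paper.
TASKS = ["channel_estimation", "signal_detection", "resource_scheduling"]


def to_tokens(values: Sequence[float], dim: int = TOKEN_DIM) -> List[List[float]]:
    """Split a flat vector into fixed-width tokens, zero-padding the last one."""
    tokens = []
    for start in range(0, len(values), dim):
        chunk = list(values[start:start + dim])
        chunk += [0.0] * (dim - len(chunk))  # pad so every token has the same width
        tokens.append(chunk)
    return tokens


def task_prompt(task: str, dim: int = TOKEN_DIM) -> List[float]:
    """One-hot stand-in for a learned task prompt embedding (an assumption)."""
    vec = [0.0] * dim
    vec[TASKS.index(task)] = 1.0
    return vec


def build_sequence(task: str, data: Sequence[float], env: Sequence[float]) -> List[List[float]]:
    """Assemble [prompt | environment tokens | data tokens] as one token sequence."""
    return [task_prompt(task)] + to_tokens(env) + to_tokens(data)


# Inputs of different lengths end up in the same token format.
seq = build_sequence("signal_detection", data=[0.1] * 10, env=[1.0, 2.0, 3.0])
print(len(seq), len(seq[0]))
```

Because every task is reduced to the same token layout, a single backbone could in principle consume all of them, and the prompt token is what tells it which task and output format is expected.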