🤖 AI Summary
This work addresses the challenge of learning universal representations for wireless baseband signals, a problem made difficult by system heterogeneity, environmental diversity, and scarce labeled data. The authors propose a Transformer-based foundation model that operates on time-frequency spectrograms, integrating a Mixture-of-Experts (MoE) architecture with multi-task self-supervised learning. By jointly optimizing masked modeling and contrastive learning objectives, the model improves representation quality without relying on extensive annotations. Experiments show that the approach significantly outperforms existing deep learning methods on both modulation classification and joint signal-to-noise ratio (SNR) and mobility recognition, yielding robust and transferable signal representations in both data-scarce and data-rich regimes.
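To make the training recipe concrete, here is a minimal PyTorch sketch of what the joint self-supervised objective could look like: a reconstruction loss computed only over masked spectrogram patches, plus an InfoNCE-style contrastive loss between embeddings of two views of the same signal. The exact loss forms, the temperature, the mask ratio, and the weighting factor `lambda_c` are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def masked_modeling_loss(pred, target, mask):
    """Reconstruction loss restricted to masked patches.

    pred, target: (batch, patches, dim) predicted vs. original patch content
    mask:         (batch, patches), 1.0 where a patch was masked out
    """
    per_patch = F.mse_loss(pred, target, reduction="none").mean(dim=-1)
    return (per_patch * mask).sum() / mask.sum().clamp_min(1)

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Contrastive loss: matching rows of z_a and z_b are positive pairs."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature          # (batch, batch) similarity matrix
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors standing in for model outputs.
B, P, D = 8, 64, 128
pred, target = torch.randn(B, P, D), torch.randn(B, P, D)
mask = (torch.rand(B, P) < 0.6).float()           # assumed ~60% mask ratio
z_a, z_b = torch.randn(B, D), torch.randn(B, D)   # embeddings of two views

lambda_c = 0.5                                     # assumed loss weight
loss = masked_modeling_loss(pred, target, mask) + lambda_c * info_nce_loss(z_a, z_b)
```

Combining the two terms is a common pattern: the masked objective pushes the model to capture local time-frequency structure, while the contrastive term shapes the global embedding space.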
📝 Abstract
Received in-phase and quadrature (I/Q) baseband signals inherently encode physical-layer and channel characteristics of wireless links. Learning robust and transferable representations directly from such raw signals, however, remains challenging due to heterogeneous communication systems, diverse propagation environments, and limited labeled data. To address this, we present LWM-Spectro, a Transformer-based foundation model pretrained on large-scale I/Q data represented as time-frequency spectrograms. The model leverages self-supervised masked modeling, contrastive learning, and a mixture-of-experts (MoE) architecture to learn general-purpose wireless representations. These representations transfer effectively to downstream tasks such as modulation classification and joint SNR/mobility recognition, even with minimal supervision. Across tasks, LWM-Spectro consistently outperforms state-of-the-art deep learning baselines in both few-shot and data-rich regimes, providing a unified foundation for wireless learning.
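For intuition about two of the named ingredients, the sketch below shows (1) turning raw complex I/Q samples into a log-magnitude spectrogram with `torch.stft`, and (2) a minimal top-k gated MoE feed-forward block of the kind a Transformer layer might use in place of its standard MLP. Every hyperparameter here (FFT size, hop length, number of experts, `k`, layer widths) is an assumption for illustration; the paper's actual configuration may differ.

```python
import torch
import torch.nn as nn

# --- I/Q to log-magnitude spectrogram (STFT parameters are assumed) ---
iq = torch.randn(4096, dtype=torch.cfloat)           # complex baseband samples
spec = torch.stft(iq, n_fft=256, hop_length=128,
                  window=torch.hann_window(256), return_complex=True)
log_mag = spec.abs().clamp_min(1e-6).log()           # (freq bins, time frames)

# --- Minimal top-k gated mixture-of-experts feed-forward block ---
class TopKMoE(nn.Module):
    def __init__(self, dim, hidden, num_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        ])
        self.gate = nn.Linear(dim, num_experts)
        self.k = k

    def forward(self, x):                            # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)            # renormalize over the top-k
        out = torch.zeros_like(x)
        for slot in range(self.k):                   # route each token to its experts
            for e, expert in enumerate(self.experts):
                sel = idx[:, slot] == e
                if sel.any():
                    out[sel] += weights[sel, slot].unsqueeze(-1) * expert(x[sel])
        return out

tokens = torch.randn(32, 128)                        # e.g. flattened spectrogram patches
moe = TopKMoE(dim=128, hidden=256)
print(moe(tokens).shape)                             # torch.Size([32, 128])
```

The top-k gate activates only a few experts per token, which is the usual motivation for MoE layers: capacity grows with the expert count while per-token compute stays roughly constant.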