🤖 AI Summary
To address the limited robustness of oversampled baseband signal demodulation over impulsive-noise channels, this work challenges the conventional paradigm that treats inter-symbol contribution (ISC) as interference, instead modeling ISC as deterministic contextual information embedded in the waveform. We propose a Transformer-based masked symbol modeling framework: complex-valued baseband sequences undergo random masking and reconstruction pretraining, leveraging bidirectional self-attention to capture long-range waveform dependencies and learn an implicit "waveform syntax." This enables semantic-level contextual reasoning over physical-layer signals, allowing the receiver to infer severely distorted symbol segments from neighboring samples under impulsive-noise corruption. Experimental results show that the proposed context-aware demodulator significantly improves bit error rate performance, validating the feasibility and effectiveness of turning ISC into structured prior knowledge.
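To make the masking step concrete, the sketch below (not the authors' code) builds an oversampled, pulse-shaped QPSK baseband waveform and randomly masks symbol-aligned spans of samples. The oversampling factor, root-raised-cosine roll-off, filter span, and 15% mask ratio are illustrative assumptions, and zeroing masked spans is a stand-in for whatever mask token the pretraining actually uses.

```python
# Minimal sketch: oversampled QPSK baseband with RRC pulse shaping, then
# random masking of symbol-aligned sample spans (masked-symbol-modeling style).
# sps, beta, span, and the 0.15 mask ratio are assumptions, not the paper's.
import numpy as np

def rrc_taps(span, sps, beta):
    """Root-raised-cosine taps; span in symbols, sps samples per symbol."""
    t = np.arange(-span * sps / 2, span * sps / 2 + 1) / sps
    taps = np.zeros_like(t)
    for i, ti in enumerate(t):
        if np.isclose(ti, 0.0):
            taps[i] = 1.0 + beta * (4 / np.pi - 1)
        elif np.isclose(abs(ti), 1 / (4 * beta)):
            taps[i] = (beta / np.sqrt(2)) * (
                (1 + 2 / np.pi) * np.sin(np.pi / (4 * beta))
                + (1 - 2 / np.pi) * np.cos(np.pi / (4 * beta)))
        else:
            num = (np.sin(np.pi * ti * (1 - beta))
                   + 4 * beta * ti * np.cos(np.pi * ti * (1 + beta)))
            den = np.pi * ti * (1 - (4 * beta * ti) ** 2)
            taps[i] = num / den
    return taps / np.sqrt(np.sum(taps ** 2))  # unit-energy normalization

rng = np.random.default_rng(0)
sps, beta, span, n_sym = 8, 0.35, 10, 256            # assumed parameters
bits = rng.integers(0, 2, size=(n_sym, 2))
symbols = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)  # QPSK

# Upsample and pulse-shape: each symbol's pulse overlaps its neighbors' (ISC).
upsampled = np.zeros(n_sym * sps, dtype=complex)
upsampled[::sps] = symbols
waveform = np.convolve(upsampled, rrc_taps(span, sps, beta), mode="same")

# Randomly mask ~15% of symbol-aligned spans; zeroing stands in for a mask token.
mask_sym = rng.random(n_sym) < 0.15
mask = np.repeat(mask_sym, sps)
masked_waveform = np.where(mask, 0.0 + 0.0j, waveform)
```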
📝 Abstract
Recent breakthroughs in natural language processing show that the attention mechanism in Transformer networks, trained via masked-token prediction, enables models to capture the semantic context of tokens and internalize the grammar of a language. While the application of Transformers to communication systems is a burgeoning field, the notion of context within physical waveforms remains under-explored. This paper addresses that gap by re-examining inter-symbol contribution (ISC) caused by pulse-shaping overlap. Rather than treating ISC as a nuisance, we view it as a deterministic source of contextual information embedded in oversampled complex baseband signals. We propose Masked Symbol Modeling (MSM), a framework for the physical (PHY) layer inspired by the Bidirectional Encoder Representations from Transformers (BERT) methodology. In MSM, a subset of symbol-aligned samples is randomly masked, and a Transformer predicts the missing symbol identifiers from the surrounding "in-between" samples. Through this objective, the model learns the latent syntax of complex baseband waveforms. We illustrate MSM's potential by applying it to the demodulation of signals corrupted by impulsive noise, where the model infers corrupted segments by leveraging the learned context. Our results suggest a path toward receivers that interpret, rather than merely detect, communication signals, opening new avenues for context-aware PHY-layer design.
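A minimal PyTorch sketch of the MSM objective described above is given below: symbol-aligned sample patches are embedded, masked positions are replaced by a learnable mask token, and a bidirectional Transformer encoder predicts the masked symbol identifiers. The patch embedding, model sizes, and training details are all assumptions for illustration, not the paper's architecture.

```python
# Hypothetical MSM sketch: BERT-style masked prediction over symbol-aligned
# patches of an oversampled baseband waveform. Dimensions are assumptions.
import torch
import torch.nn as nn

class MaskedSymbolModel(nn.Module):
    def __init__(self, sps=8, n_classes=4, d_model=128, n_layers=4, n_heads=4):
        super().__init__()
        # Each patch holds the sps complex samples of one symbol interval,
        # flattened to 2*sps real features (real and imaginary parts).
        self.embed = nn.Linear(2 * sps, d_model)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
        self.pos = nn.Parameter(torch.zeros(1, 512, d_model))  # up to 512 symbols
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)  # symbol-identifier logits

    def forward(self, patches, mask):
        # patches: (B, n_sym, 2*sps) real features; mask: (B, n_sym) bool
        x = self.embed(patches)
        # Replace masked patches with the learnable mask token; bidirectional
        # self-attention lets unmasked "in-between" samples supply the context.
        x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
        x = x + self.pos[:, : x.size(1)]
        return self.head(self.encoder(x))

# One pretraining step: loss is computed on masked positions only.
model = MaskedSymbolModel()
B, n_sym, sps = 4, 256, 8
patches = torch.randn(B, n_sym, 2 * sps)    # stand-in for real waveform patches
labels = torch.randint(0, 4, (B, n_sym))    # QPSK symbol identifiers
mask = torch.rand(B, n_sym) < 0.15          # assumed BERT-style mask ratio
logits = model(patches, mask)
loss = nn.functional.cross_entropy(logits[mask], labels[mask])
loss.backward()
```

As in BERT, restricting the loss to masked positions forces the model to reconstruct a symbol purely from its neighbors' pulse tails, which is exactly the ISC-as-context premise.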