AI Summary
Modeling complex spatiotemporal dynamics in large-scale MEG data remains challenging. To address this, we propose MEG-GPT, the first Transformer-based large language model designed for continuous neural time-series signals. Its key innovations include: (1) a data-driven tokenization scheme preserving millisecond-level temporal resolution; (2) a time-aware attention mechanism coupled with autoregressive next-timepoint prediction for efficient unsupervised representation learning; and (3) zero-shot generalization across sessions and subjects. Evaluated on the Cam-CAN dataset, MEG-GPT synthesizes realistic MEG signals exhibiting authentic spatiotemporal spectral characteristics. In downstream decoding tasks, it achieves cross-session accuracy of 0.59 (+0.05 over baseline) and cross-subject accuracy of 0.49 (+0.08), significantly outperforming conventional approaches. These results demonstrate MEG-GPT's capacity to learn rich, transferable representations from raw, continuous neurophysiological signals without task-specific supervision.
Abstract
Modelling the complex spatiotemporal patterns of large-scale brain dynamics is crucial for neuroscience, but traditional methods fail to capture the rich structure in modalities such as magnetoencephalography (MEG). Recent advances in deep learning have enabled significant progress in other domains, such as language and vision, by using foundation models at scale. Here, we introduce MEG-GPT, a transformer-based foundation model that uses time-attention and next-time-point prediction. To facilitate this, we also introduce a novel data-driven tokeniser for continuous MEG data, which preserves the high temporal resolution of the continuous MEG signal without lossy transformations. We trained MEG-GPT on tokenised brain region time-courses extracted from a large-scale MEG dataset (N=612, eyes-closed rest, Cam-CAN data), and show that the learnt model can generate data with realistic spatio-spectral properties, including transient events and population variability. Critically, it performs well in downstream supervised decoding tasks, showing improved zero-shot generalisation across sessions (accuracy improving from 0.54 to 0.59) and across subjects (from 0.41 to 0.49) compared to baseline methods. Furthermore, we show the model can be efficiently fine-tuned on a smaller labelled dataset to boost performance in cross-subject decoding scenarios. This work establishes a powerful foundation model for electrophysiological data, paving the way for applications in computational neuroscience and neural decoding.
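To make the tokenisation idea concrete: the abstract's central claim is that continuous MEG signals can be mapped to discrete tokens one timepoint at a time, so no temporal resolution is lost before autoregressive modelling. The sketch below illustrates this with simple quantile binning; the class name, vocabulary size, and binning scheme are illustrative assumptions, not the paper's actual data-driven tokeniser.

```python
import numpy as np

class QuantileTokenizer:
    """Illustrative tokenizer for continuous signals.

    ASSUMPTION: quantile binning stands in for MEG-GPT's actual
    data-driven tokeniser. The property it demonstrates is the one
    the abstract emphasises: one discrete token per timepoint, so
    millisecond-level temporal resolution is preserved (no windowing
    or downsampling before the transformer sees the data).
    """

    def __init__(self, vocab_size=64):
        self.vocab_size = vocab_size
        self.edges = None    # learnt bin edges (data-driven)
        self.centers = None  # representative value per token

    def fit(self, x):
        # Learn bin edges from the empirical signal distribution.
        qs = np.linspace(0.0, 1.0, self.vocab_size + 1)
        self.edges = np.quantile(x, qs)
        self.centers = 0.5 * (self.edges[:-1] + self.edges[1:])
        return self

    def encode(self, x):
        # Map each timepoint to a token id in [0, vocab_size).
        return np.searchsorted(self.edges[1:-1], x, side="right")

    def decode(self, ids):
        # Map token ids back to continuous values (bin centers),
        # as needed when generating synthetic signals token by token.
        return self.centers[ids]

# Usage: tokenise a continuous time series, then reconstruct it.
rng = np.random.default_rng(0)
signal = rng.standard_normal(10_000)  # stand-in for one region time-course
tok = QuantileTokenizer(vocab_size=64).fit(signal)
ids = tok.encode(signal)
recon = tok.decode(ids)
```

A next-timepoint model is then trained on `ids` exactly like a language model on word tokens: predict token t+1 from tokens 1..t with a causal mask, which is the unsupervised objective the abstract describes.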