🤖 AI Summary
Existing EEG models are task-specific, exhibit poor generalization, and suffer from inefficient training. Method: We propose NeuroLM—the first multitask foundation model treating EEG signals as a “neural language.” It employs a text-aligned vector-quantized temporal-frequency predictive tokenizer to discretize continuous EEG into tokens and adopts a cross-modal architecture combining a frozen EEG encoder with a trainable large language model (LLM), enabling instruction-tuned multitask learning. Contributions/Results: NeuroLM is the first unified foundation model supporting six heterogeneous downstream tasks—including brain signal decoding, intent recognition, and anomaly detection—under a single architecture. It introduces a novel neural–linguistic cross-modal alignment paradigm. Its largest variant, NeuroLM-XL, contains 1.7 billion parameters, making it the largest EEG foundation model to date. Experiments demonstrate significant improvements over state-of-the-art methods across six benchmark datasets, with strong zero-shot and few-shot transfer capabilities.
📝 Abstract
Recent advancements in large-scale pre-training with neural signals such as electroencephalogram (EEG) have shown promising results, significantly boosting the development of brain-computer interfaces (BCIs) and healthcare. However, these pre-trained models often require full fine-tuning on each downstream task to achieve substantial improvements, limiting their versatility and usability, and leading to considerable resource wastage. To tackle these challenges, we propose NeuroLM, the first multi-task foundation model that leverages the capabilities of Large Language Models (LLMs) by regarding EEG signals as a foreign language, endowing the model with multi-task learning and inference capabilities. Our approach begins with learning a text-aligned neural tokenizer through vector-quantized temporal-frequency prediction, which encodes EEG signals into discrete neural tokens. These EEG tokens, generated by the frozen vector-quantized (VQ) encoder, are then fed into an LLM that learns causal EEG information via multi-channel autoregression. Consequently, NeuroLM can understand both EEG and language modalities. Finally, multi-task instruction tuning adapts NeuroLM to various downstream tasks. We are the first to demonstrate that, through this specific incorporation of LLMs, NeuroLM unifies diverse EEG tasks within a single model via instruction tuning. The largest variant, NeuroLM-XL, has a record-breaking 1.7 billion parameters for EEG signal processing and is pre-trained on a large-scale corpus comprising approximately 25,000 hours of EEG data. When evaluated on six diverse downstream datasets, NeuroLM showcases the huge potential of this multi-task learning paradigm.
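The core idea of the tokenizer—mapping continuous EEG patches to discrete token indices via a vector-quantized codebook—can be illustrated with a minimal sketch. This is a toy nearest-neighbor VQ lookup only; the patch length, codebook size, and `quantize_eeg_patches` helper are illustrative assumptions, and the actual NeuroLM tokenizer is a learned encoder trained with temporal-frequency prediction and text alignment.

```python
import numpy as np

def quantize_eeg_patches(eeg, patch_len, codebook):
    """Toy VQ encoding: split a single-channel EEG signal into
    non-overlapping patches and map each patch to the index of its
    nearest codebook vector (squared L2 distance).

    eeg: 1-D array of samples; codebook: (K, patch_len) array.
    Returns an array of discrete "neural token" indices in [0, K).
    """
    n = len(eeg) // patch_len
    patches = eeg[: n * patch_len].reshape(n, patch_len)
    # distance from every patch to every codebook entry
    dists = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # K=8 learned codes of length 4 (toy values)
signal = rng.normal(size=20)         # toy single-channel EEG segment
tokens = quantize_eeg_patches(signal, 4, codebook)
print(tokens)  # 5 token indices, each in {0..7}
```

In the full model, sequences of such token indices (one stream per channel) are what the LLM consumes for multi-channel autoregression, alongside ordinary text tokens.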