REVE: A Foundation Model for EEG -- Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects

📅 2025-10-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing EEG foundation models generalize poorly across datasets, particularly under linear probing, because public recordings differ in acquisition devices, experimental protocols, and electrode configurations. To address this, we propose REVE, an EEG foundation model designed to generalize across recording setups. The method combines three elements: a 4D positional encoding scheme that handles arbitrary signal duration and electrode layouts; a masked autoencoding pretraining objective; and a pretraining corpus of 92 datasets, 25,000 subjects, and over 60,000 hours of EEG, which the authors describe as the largest EEG pretraining effort to date. Evaluated on ten downstream tasks, including motor imagery classification, seizure detection, and sleep staging, the model achieves state-of-the-art performance. Notably, it performs strongly under linear probing and with little to no fine-tuning.
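Linear probing, the evaluation setting this summary emphasizes, keeps the pretrained encoder frozen and trains only a linear classifier on its output features. The sketch below illustrates the idea under stated assumptions: the features are synthetic stand-ins rather than REVE embeddings, the function names `fit_linear_probe` and `predict` are hypothetical, and a closed-form ridge least-squares classifier is swapped in for the softmax layer that is more commonly trained in practice.

```python
import numpy as np


def fit_linear_probe(features, labels, n_classes, l2=1e-3):
    """Fit a ridge least-squares linear classifier on frozen features.

    features: (n_samples, d) array produced by a frozen encoder (assumed given).
    labels:   (n_samples,) integer class ids in [0, n_classes).
    Returns a weight matrix of shape (d + 1, n_classes); the last row is the bias.
    """
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # append bias column
    Y = np.eye(n_classes)[labels]                               # one-hot targets
    # Regularized normal equations (the bias is regularized too, for simplicity).
    A = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)


def predict(features, weights):
    """Return the class with the highest linear score for each sample."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return np.argmax(X @ weights, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "frozen features": two well-separated clusters in 8 dimensions.
    feats = np.vstack([rng.normal(-3.0, 1.0, (100, 8)),
                       rng.normal(3.0, 1.0, (100, 8))])
    labels = np.array([0] * 100 + [1] * 100)
    w = fit_linear_probe(feats, labels, n_classes=2)
    print("train accuracy:", np.mean(predict(feats, w) == labels))
```

Because only the linear layer is fitted, probe accuracy reflects how linearly separable the frozen representation already is, which is why it is a stricter test of a pretrained model than full fine-tuning.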

📝 Abstract

Foundation models have transformed AI by reducing reliance on task-specific data through large-scale pretraining. While successful in language and vision, their adoption in EEG has lagged due to the heterogeneity of public datasets, which are collected under varying protocols, devices, and electrode configurations. Existing EEG foundation models struggle to generalize across these variations, often restricting pretraining to a single setup, resulting in suboptimal performance, in particular under linear probing. We present REVE (Representation for EEG with Versatile Embeddings), a pretrained model explicitly designed to generalize across diverse EEG signals. REVE introduces a novel 4D positional encoding scheme that enables it to process signals of arbitrary length and electrode arrangement. Using a masked autoencoding objective, we pretrain REVE on over 60,000 hours of EEG data from 92 datasets spanning 25,000 subjects, representing the largest EEG pretraining effort to date. REVE achieves state-of-the-art results on 10 downstream EEG tasks, including motor imagery classification, seizure detection, sleep staging, cognitive load estimation, and emotion recognition. With little to no fine-tuning, it demonstrates strong generalization and nuanced spatio-temporal modeling. We release code, pretrained weights, and tutorials to support standardized EEG research and accelerate progress in clinical neuroscience.
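The abstract does not spell out how the 4D positional encoding is built, so the following is only a minimal sketch of one plausible construction, not the REVE implementation. It assumes each token corresponds to one electrode and one time patch, with coordinates (x, y, z, t): the electrode's 3D position plus the patch index. Each coordinate is encoded with sinusoidal features and the four encodings are concatenated. The function names, the `max_period` constant, and the choice of sinusoidal features are all assumptions made for illustration.

```python
import numpy as np


def token_coordinates(electrode_xyz, n_patches):
    """Build (x, y, z, t) coordinates for every (electrode, time patch) token.

    electrode_xyz: (n_channels, 3) electrode positions (units are an assumption).
    n_patches:     number of time patches in the recording.
    Returns an array of shape (n_channels * n_patches, 4), ordered channel-major.
    """
    n_channels = electrode_xyz.shape[0]
    xyz = np.repeat(electrode_xyz, n_patches, axis=0)                   # each channel repeated per patch
    t = np.tile(np.arange(n_patches, dtype=float), n_channels)[:, None]  # patch index per token
    return np.concatenate([xyz, t], axis=1)


def sincos_features(coords, dim_per_axis, max_period=10000.0):
    """Sinusoidal encoding applied independently to each coordinate axis.

    coords: (n_tokens, n_axes) float array.
    Returns (n_tokens, n_axes * dim_per_axis); per axis, sines come first, then cosines.
    """
    if dim_per_axis % 2 != 0:
        raise ValueError("dim_per_axis must be even")
    n_freqs = dim_per_axis // 2
    freqs = 1.0 / (max_period ** (np.arange(n_freqs) / n_freqs))   # (n_freqs,)
    angles = coords[:, :, None] * freqs[None, None, :]             # (n_tokens, n_axes, n_freqs)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical setups with different channel counts and durations.
    for n_channels, n_patches in [(19, 10), (64, 7)]:
        xyz = rng.uniform(-1.0, 1.0, (n_channels, 3))
        pe = sincos_features(token_coordinates(xyz, n_patches), dim_per_axis=16)
        print(n_channels, n_patches, pe.shape)
```

Because the encoding is a function of continuous coordinates rather than a lookup table indexed by channel name or position in a fixed montage, the same code handles any number of electrodes and any recording length, which is the property the abstract attributes to the scheme.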

Problem

Research questions and friction points this paper is trying to address.

Addressing EEG signal heterogeneity across varying protocols and devices

Overcoming limited generalization of existing EEG foundation models

Enabling flexible processing of arbitrary EEG signal configurations

Innovation

Methods, ideas, or system contributions that make the work stand out.

4D positional encoding for arbitrary EEG setups

Pretrained on over 60,000 hours of EEG using masked autoencoding

Generalizes across 10 tasks with minimal fine-tuning
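The masked autoencoding objective named above can be illustrated with a small sketch: hide a random subset of signal patches and score the reconstruction only on the hidden ones. This is a generic illustration, not the paper's training code. The mask ratio, the use of mean squared error, and the function names are assumptions, and no encoder or decoder is included; the reconstruction is passed in as a plain array.

```python
import numpy as np


def random_patch_mask(n_tokens, mask_ratio, rng):
    """Return a boolean mask of length n_tokens; True marks a hidden (masked) token."""
    n_masked = int(round(mask_ratio * n_tokens))
    mask = np.zeros(n_tokens, dtype=bool)
    mask[rng.choice(n_tokens, size=n_masked, replace=False)] = True
    return mask


def masked_reconstruction_loss(patches, reconstruction, mask):
    """Mean squared error computed over masked tokens only.

    patches, reconstruction: (n_tokens, patch_len) arrays.
    mask: (n_tokens,) boolean array, True where the input was hidden from the model.
    """
    if not mask.any():
        raise ValueError("mask must hide at least one token")
    diff = reconstruction[mask] - patches[mask]
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.normal(size=(20, 50))           # 20 tokens, 50 samples each (synthetic)
    mask = random_patch_mask(20, mask_ratio=0.5, rng=rng)
    # A stand-in "model output" that is off by a constant everywhere.
    print(masked_reconstruction_loss(patches, patches + 1.0, mask))
```

Restricting the loss to masked tokens means the model gets no credit for copying visible input, so it has to infer the hidden patches from their spatial and temporal context.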

🔎 Similar Papers

BrainWave: A Brain Signal Foundation Model for Clinical Applications