REVE: A Foundation Model for EEG -- Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects

📅 2025-10-24
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Existing EEG foundation models exhibit poor cross-dataset generalization, particularly under linear probing, due to heterogeneity in acquisition devices, experimental protocols, and electrode configurations. To address this, we propose a universal foundation model for EEG that adapts to any recording setup. Our method introduces a scalable 4D positional encoding scheme supporting arbitrary signal durations and electrode layouts; the largest-ever EEG pretraining effort, leveraging 92 datasets, 25,000 subjects, and over 60,000 hours of EEG data; and a unified architecture integrating masked self-supervised reconstruction, large-scale contrastive learning, spatiotemporal attention, and a flexible embedding structure adaptable to diverse electrode configurations. Evaluated on ten downstream tasks, including motor imagery, epilepsy detection, and sleep staging, our model achieves state-of-the-art performance. Notably, it enables high-accuracy spatiotemporal pattern modeling via linear probing, without any fine-tuning.
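For concreteness, here is a minimal sketch of the linear-probing protocol mentioned above: freeze the pretrained encoder, pool its token embeddings, and fit a plain logistic-regression classifier on top. The encoder interface (`pretrained_encoder`, its output shape) is a hypothetical stand-in, not REVE's released API.

```python
import torch
from sklearn.linear_model import LogisticRegression

def extract_embeddings(encoder, loader):
    """Pooled features from a frozen encoder; no gradients, no fine-tuning."""
    encoder.eval()
    feats = []
    with torch.no_grad():
        for x, _ in loader:              # x: (batch, channels, time)
            z = encoder(x)               # assumed output: (batch, tokens, dim)
            feats.append(z.mean(dim=1))  # mean-pool tokens -> (batch, dim)
    return torch.cat(feats).cpu().numpy()

# Usage (hypothetical data loaders and labels):
# X_tr = extract_embeddings(pretrained_encoder, train_loader)
# probe = LogisticRegression(max_iter=1000).fit(X_tr, y_train)
# print(probe.score(extract_embeddings(pretrained_encoder, test_loader), y_test))
```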

📝 Abstract
Foundation models have transformed AI by reducing reliance on task-specific data through large-scale pretraining. While successful in language and vision, their adoption in EEG has lagged due to the heterogeneity of public datasets, which are collected under varying protocols, devices, and electrode configurations. Existing EEG foundation models struggle to generalize across these variations, often restricting pretraining to a single setup, resulting in suboptimal performance, in particular under linear probing. We present REVE (Representation for EEG with Versatile Embeddings), a pretrained model explicitly designed to generalize across diverse EEG signals. REVE introduces a novel 4D positional encoding scheme that enables it to process signals of arbitrary length and electrode arrangement. Using a masked autoencoding objective, we pretrain REVE on over 60,000 hours of EEG data from 92 datasets spanning 25,000 subjects, representing the largest EEG pretraining effort to date. REVE achieves state-of-the-art results on 10 downstream EEG tasks, including motor imagery classification, seizure detection, sleep staging, cognitive load estimation, and emotion recognition. With little to no fine-tuning, it demonstrates strong generalization and nuanced spatio-temporal modeling. We release code, pretrained weights, and tutorials to support standardized EEG research and accelerate progress in clinical neuroscience.
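As a rough illustration of the masked-autoencoding objective named in the abstract (the patching, mask ratio, and model interface below are illustrative assumptions, not the paper's exact recipe):

```python
import torch
import torch.nn.functional as F

def masked_autoencoding_loss(model, x, mask_ratio=0.5):
    """MAE-style objective on EEG patches.

    x: (batch, n_patches, patch_dim), i.e. EEG windows already cut into
    per-channel time patches. `model` maps the masked sequence back to
    signal space; the loss is taken on masked positions only.
    """
    b, n, _ = x.shape
    mask = torch.rand(b, n, device=x.device) < mask_ratio  # True = hidden
    x_in = x.masked_fill(mask.unsqueeze(-1), 0.0)          # zero out hidden patches
    recon = model(x_in)                                    # (batch, n_patches, patch_dim)
    return F.mse_loss(recon[mask], x[mask])
```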
Problem

Research questions and friction points this paper is trying to address.

Addressing EEG signal heterogeneity across varying protocols and devices
Overcoming limited generalization of existing EEG foundation models
Enabling flexible processing of arbitrary EEG signal configurations
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D positional encoding for arbitrary EEG setups (see the sketch after this list)
Pretrained on over 60,000 hours of EEG from 92 datasets using masked autoencoding
Generalizes across 10 downstream tasks with little to no fine-tuning
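The page does not spell out the 4D scheme, so the sketch below rests on an assumption: treat the four axes as an electrode's 3D head coordinates (x, y, z) plus patch time t, embed each sinusoidally, and concatenate, so that any montage and any duration yields a code per (time, electrode) token.

```python
import torch

def sinusoidal(pos, dim):
    """Standard sinusoidal embedding of scalar positions into `dim` dims."""
    i = torch.arange(dim // 2, dtype=torch.float32)
    freqs = 1.0 / (10000.0 ** (2 * i / dim))
    ang = pos.unsqueeze(-1) * freqs          # (..., dim // 2)
    return torch.cat([ang.sin(), ang.cos()], dim=-1)

def pos_encoding_4d(xyz, t, dim_per_axis=32):
    """Assumed 4D scheme: embed (x, y, z) electrode coordinates and patch
    time t separately, then concatenate.

    xyz: (n_electrodes, 3) head-space coordinates; t: (n_times,) seconds.
    Returns (n_times, n_electrodes, 4 * dim_per_axis).
    """
    ex = sinusoidal(xyz[:, 0], dim_per_axis)   # (n_electrodes, dim)
    ey = sinusoidal(xyz[:, 1], dim_per_axis)
    ez = sinusoidal(xyz[:, 2], dim_per_axis)
    spatial = torch.cat([ex, ey, ez], dim=-1)  # (n_electrodes, 3 * dim)
    temporal = sinusoidal(t, dim_per_axis)     # (n_times, dim)
    # Broadcast so every (time, electrode) token gets its own 4D code.
    return torch.cat([
        spatial.unsqueeze(0).expand(t.shape[0], -1, -1),
        temporal.unsqueeze(1).expand(-1, xyz.shape[0], -1),
    ], dim=-1)

# Example: 64 electrodes, 10 time patches -> (10, 64, 128)
# pe = pos_encoding_4d(torch.randn(64, 3), torch.arange(10.0))
```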
👥 Authors
Yassine El Ouahidi
IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238 Brest, France
Jonathan Lys
IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238 Brest, France
Philipp Thölke
Psychology Department, Université de Montréal, Montreal, QC, Canada
Nicolas Farrugia
IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238 Brest, France
Bastien Pasdeloup
IMT Atlantique (Signal processing on graphs)
Vincent Gripon
IMT Atlantique and Lab-STICC (Deep Learning, Few-Shot Learning, Artificial Intelligence)
Karim Jerbi
Psychology Department, Université de Montréal, Montreal, QC, Canada; Mila (Quebec AI research institute), Montreal, QC, Canada; UNIQUE (Quebec Neuro-AI research center), QC, Canada
Giulia Lioi
IMT Atlantique, Lab-STICC, UMR CNRS 6285, F-29238 Brest, France