A Multi-dimensional Framework for Evaluating Generalization in EEG Foundation Models

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a gap in how EEG foundation models are evaluated: realistic biomedical constraints, such as limited labeled data, reduced electrode coverage, and the need for parameter-efficient adaptation, are overlooked when evaluation relies solely on full fine-tuning on well-curated datasets. The study proposes a multi-dimensional evaluation framework tailored to these low-resource conditions and benchmarks recent EEG foundation models (LaBraM, CSBrain, CBraMod) against supervised baselines across six datasets. Evaluations span limited-label settings, short-window and long-context tasks, and reduced sensor configurations. The results indicate that foundation models provide consistent gains on long-context tasks such as sleep staging and mental health state classification, whereas on short-window brain-computer interface (BCI) tasks, supervised models with substantially fewer parameters achieve comparable performance. Foundation models also show limited robustness in channel-constrained settings. Together, these findings delineate where current models are applicable and motivate evaluating them under realistic use constraints.

📝 Abstract

Evaluating foundation models under appropriate adaptation settings is essential for understanding the quality and transferability of the learned representations. Recent EEG foundation models have demonstrated promising transfer capabilities across tasks and datasets, motivating their growing use in neurotechnology and clinical applications. However, these models are typically evaluated under full fine-tuning on well-curated downstream datasets, a setting that does not reflect biomedical domain constraints such as limited labeled data, reduced sensor coverage, or parameter-efficient adaptation. In this work, we propose a multi-dimensional evaluation framework for assessing EEG models under realistic low-resource conditions. Under this framework, we empirically analyze both supervised EEG models and recent EEG foundation models, including LaBraM, CSBrain, and CBraMod, across 6 datasets. We find that EEG foundation models consistently provide performance gains on long-context tasks such as sleep stage prediction and mental health state classification. In contrast, for short-window brain-computer interface (BCI) style tasks, supervised models achieve comparable performance despite having substantially fewer parameters. Additional analyses demonstrate that current foundation models provide limited robustness in short-window tasks and channel-constrained settings. Together, these findings motivate the use of multi-dimensional evaluation protocols that characterize model behavior under realistic use constraints.
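The evaluation axes named in the abstract (limited labeled data, reduced sensor coverage, and the adaptation setting) can be pictured as a grid of conditions. The sketch below is only an illustration of that idea, not the paper's protocol: the label fractions, channel counts, adaptation names, and all function names (`build_grid`, `subsample_per_class`, `select_channels`) are hypothetical and chosen for the example.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCondition:
    label_fraction: float  # share of labeled training trials kept
    n_channels: int        # number of electrodes retained
    adaptation: str        # hypothetical name, e.g. "linear_probe"


def build_grid(label_fractions, channel_counts, adaptations):
    """Cross every value on each axis to enumerate evaluation conditions."""
    return [
        EvalCondition(f, c, a)
        for f, c, a in itertools.product(label_fractions, channel_counts, adaptations)
    ]


def subsample_per_class(labels, fraction, seed=0):
    """Return sorted indices keeping roughly `fraction` of each class (at least one)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    kept = []
    for idxs in by_class.values():
        k = max(1, round(fraction * len(idxs)))
        kept.extend(rng.sample(idxs, k))
    return sorted(kept)


def select_channels(trial, keep):
    """Keep only the listed channel rows of a (channels x samples) trial."""
    return [trial[ch] for ch in keep]


# Illustrative values only; the paper's actual settings are not given here.
grid = build_grid([0.01, 0.1, 1.0], [4, 16, 64], ["full_finetune", "linear_probe"])
labels = [0] * 50 + [1] * 50
subset = subsample_per_class(labels, 0.1)
```

Each `EvalCondition` would then drive one training and evaluation run per model and dataset, so that results can be compared along one axis while the others are held fixed.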

Problem

Research questions and friction points this paper is trying to address.

EEG foundation models

generalization

low-resource evaluation

transferability

realistic constraints

Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-dimensional evaluation

EEG foundation models

low-resource adaptation