🤖 AI Summary
This work addresses the significant performance degradation of EEG foundation models under real-world distribution shifts across devices, populations, and recording scenarios. To this end, we present NeuroAdapt-Bench, the first benchmark for test-time adaptation (TTA) tailored to EEG foundation models. It enables systematic evaluation of representative TTA approaches, both gradient-based and optimization-free, across multiple pretrained models, downstream tasks, and heterogeneous datasets spanning in-distribution, out-of-distribution, and extreme modality shifts such as Ear-EEG. Our experiments reveal that standard TTA methods yield inconsistent gains and often degrade performance on EEG tasks, with gradient-based methods especially prone to severe degradation. Optimization-free methods, by contrast, are more stable and improve performance more reliably, which highlights their practical promise for neural signal processing in clinical settings.
📝 Abstract
Electroencephalography (EEG) foundation models have shown strong potential for learning generalizable representations from large-scale neural data, yet their clinical deployment is hindered by distribution shifts across clinical settings, devices, and populations. Test-time adaptation (TTA) offers a promising solution by enabling models to adapt to unlabeled target data during inference without access to source data, a valuable property in healthcare settings constrained by privacy regulations and limited labeled data. However, its effectiveness for EEG remains largely underexplored. In this work, we introduce NeuroAdapt-Bench, a systematic benchmark for evaluating test-time adaptation methods on EEG foundation models under realistic distribution shifts. We evaluate representative TTA approaches from other domains across multiple pretrained foundation models, diverse downstream tasks, and heterogeneous datasets spanning in-distribution, out-of-distribution, and extreme modality shifts (e.g., Ear-EEG). Our results show that standard TTA methods yield inconsistent gains and often degrade performance, with gradient-based approaches particularly prone to severe degradation. In contrast, optimization-free methods demonstrate greater stability and more reliable improvements. These findings highlight the limitations of existing TTA techniques in EEG, provide guidance for future development, and underscore the need for domain-specific adaptation strategies.
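To make the term "optimization-free" concrete, the sketch below shows one simple adaptation of that kind: re-standardizing features with statistics of the unlabeled test batch while the classifier stays frozen. It is a toy illustration only. The abstract does not name the benchmarked methods, so this is not necessarily one of them, and the classifier weights, the `renormalize` helper, and the data are all hypothetical. A gradient-based method would instead update model parameters by backpropagating an unsupervised loss (for example, prediction entropy) on the test data.

```python
from statistics import fmean, pstdev


def predict(features, weights, bias):
    """Frozen linear classifier: returns 1 if w.x + b > 0, else 0."""
    return [
        int(sum(w * x for w, x in zip(weights, row)) + bias > 0)
        for row in features
    ]


def renormalize(batch, eps=1e-6):
    """Standardize each feature dimension using the test batch's own
    mean and standard deviation. No labels, no gradients, no source data."""
    dims = list(zip(*batch))
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) for d in dims]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(row, means, stds)]
        for row in batch
    ]


# Hypothetical frozen classifier, assumed to be trained on zero-mean features.
WEIGHTS, BIAS = [1.0, 0.0], 0.0

# Toy target batch: same class structure as the source, but every feature
# carries a constant offset (standing in for a device-induced shift).
target = [[4.0, 1.0], [6.0, 1.2], [4.2, 0.8], [5.8, 1.1]]
labels = [0, 1, 0, 1]

before = predict(target, WEIGHTS, BIAS)              # offset pushes all scores positive
after = predict(renormalize(target), WEIGHTS, BIAS)  # offset removed before scoring

print("without adaptation:", before)
print("with adaptation:   ", after)
```

In this toy case the offset makes the frozen classifier assign every sample to class 1, and re-centering the batch restores the correct split. Because nothing is optimized, there is no learning rate or update step that can diverge, which is one plausible intuition for why such methods may behave more stably; real EEG shifts are far less tidy than a constant offset.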