🤖 AI Summary
This work addresses the limitations of existing EEG foundation models, which struggle to model neural signals with low signal-to-noise ratios and complex time-frequency non-stationarity, and which neglect the signals' intrinsic hierarchical structure, thereby constraining fine-grained reconstruction performance. To overcome these challenges, the authors propose BrainRVQ, a general-purpose foundation model pretrained on large-scale clinical EEG data. Its key innovations include a dual-domain residual vector quantization (DD-RVQ) tokenizer that disentangles time-domain waveforms and spectral patterns into hierarchical discrete codes, coupled with a self-supervised pretraining strategy featuring importance-guided curriculum masking and coarse-to-fine hierarchical autoregressive objectives. Evaluated across eight diverse downstream tasks, BrainRVQ substantially outperforms current state-of-the-art methods, demonstrating its superior capability in learning robust and generalizable neural representations.
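The summary does not detail how the DD-RVQ tokenizer is implemented; the sketch below only illustrates the general residual vector quantization technique that the name refers to, in which each codebook level quantizes the residual left by the previous one. The codebook sizes, depth, and feature dimension are illustrative assumptions, and the dual-domain (time vs. spectral) split described in the paper is not reproduced here.

```python
# Minimal sketch of residual vector quantization (RVQ), the generic technique
# behind a "residual vector quantization" tokenizer. All shapes and codebook
# sizes below are hypothetical, not the paper's configuration.
import numpy as np

rng = np.random.default_rng(0)

def rvq_encode(x, codebooks):
    """Quantize a batch of vectors with a stack of residual codebooks.

    x:          (batch, dim) feature vectors (e.g. EEG patch embeddings)
    codebooks:  list of (codebook_size, dim) arrays, ordered coarse to fine
    returns:    (batch, depth) integer codes and the running reconstruction
    """
    residual = x.copy()
    codes, recon = [], np.zeros_like(x)
    for cb in codebooks:
        # nearest codeword for the current residual
        d = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(axis=1)
        codes.append(idx)
        quantized = cb[idx]
        recon += quantized          # each level refines the reconstruction
        residual -= quantized       # the next level models what is left over
    return np.stack(codes, axis=1), recon

# toy example: 3 codebooks of 256 entries over 64-dimensional features
codebooks = [rng.normal(size=(256, 64)) for _ in range(3)]
x = rng.normal(size=(8, 64))
codes, recon = rvq_encode(x, codebooks)
print(codes.shape, np.linalg.norm(x - recon))
```

Because each level only has to encode the residual error of the previous levels, the resulting code stack is naturally hierarchical: early codes capture coarse structure and later codes add detail, which is what makes a coarse-to-fine reconstruction objective possible.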
📝 Abstract
Developing foundation models for electroencephalography (EEG) remains challenging due to the signal's low signal-to-noise ratio and complex spectro-temporal non-stationarity. Existing approaches often overlook the hierarchical latent structure inherent in neural dynamics, leading to suboptimal reconstruction of fine-grained information. In this work, we propose BrainRVQ, a general-purpose EEG foundation model pre-trained on a large-scale corpus of clinical EEG data. Unlike standard masked modeling, BrainRVQ features a Dual-Domain Residual Vector Quantization (DD-RVQ) tokenizer that disentangles temporal waveforms and spectral patterns into hierarchical discrete codes. We further introduce a hierarchical autoregressive pre-training objective that learns to reconstruct these codes in a coarse-to-fine manner, utilizing an importance-guided curriculum masking strategy to prioritize information-rich neural events over background noise. Extensive experiments across 8 diverse downstream datasets demonstrate that BrainRVQ consistently outperforms state-of-the-art baselines, validating its effectiveness in learning robust and generalizable neural representations. Our code and model weights are available at https://github.com/keqicmz/BrainRVQ
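The abstract names two pre-training ingredients without specifying them: importance-guided curriculum masking and a coarse-to-fine objective over the hierarchical codes. The sketch below is one plausible reading of those ideas, not the paper's recipe: the importance score (here simply patch energy), the mask-sampling scheme, and the per-level loss weighting are all assumptions introduced for illustration.

```python
# Hedged sketch of (a) masking biased toward "important" patches and
# (b) a coarse-to-fine cross-entropy objective over RVQ code levels.
# The scoring function and weighting are illustrative assumptions only.
import torch
import torch.nn.functional as F

def importance_guided_mask(patches, mask_ratio):
    """Sample a boolean mask biased toward high-importance patches.

    patches: (batch, num_patches, dim) patch embeddings
    """
    importance = patches.pow(2).mean(-1)                   # proxy importance score
    probs = importance / importance.sum(-1, keepdim=True)
    num_masked = int(mask_ratio * patches.shape[1])
    idx = torch.multinomial(probs, num_masked)              # biased sampling without replacement
    mask = torch.zeros(patches.shape[:2], dtype=torch.bool)
    mask.scatter_(1, idx, True)
    return mask

def coarse_to_fine_loss(logits_per_level, target_codes, level_weights):
    """Cross-entropy summed over RVQ levels, weighted coarse to fine.

    logits_per_level: list of (batch, num_masked, codebook_size) tensors
    target_codes:     (batch, num_masked, depth) ground-truth RVQ codes
    level_weights:    per-level scalars, e.g. larger for coarser levels
    """
    loss = 0.0
    for level, (logits, w) in enumerate(zip(logits_per_level, level_weights)):
        loss = loss + w * F.cross_entropy(
            logits.flatten(0, 1), target_codes[..., level].flatten())
    return loss
```

In this reading, masking concentrates the reconstruction burden on information-rich segments rather than flat background activity, and the level-wise loss encourages the model to get coarse codes right before refining them, mirroring the coarse-to-fine structure of the tokenizer's code stack.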