CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding

📅 2025-06-28
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
EEG decoding suffers from inadequate modeling of cross-scale spatiotemporal dynamics: existing foundation models adopt scale-agnostic, dense paradigms inherited from NLP/vision, failing to capture the multiscale nature of neural activityโ€”from millisecond-scale local bursts to slow, whole-brain rhythmic interactions. To address this, we propose CSBrain, the first EEG foundation model incorporating neuroscience-informed cross-scale spatiotemporal tokenization (CST) and structured sparse attention (SSA). CST explicitly encodes temporal hierarchies and spatial functional organization, while SSA suppresses spurious correlations via biologically grounded sparsity constraints. CSBrain employs an alternating stacked architecture and is pretrained on large-scale EEG data. Evaluated across 16 datasets and 11 diverse EEG tasks, it consistently outperforms both task-specific models and prior foundation models, demonstrating superior generalization. Our work establishes cross-scale spatiotemporal modeling as a fundamental inductive bias for brain foundation models.

๐Ÿ“ Abstract
Understanding and decoding brain activity from electroencephalography (EEG) signals is a fundamental challenge in neuroscience and AI, with applications in cognition, emotion recognition, diagnosis, and brain-computer interfaces. While recent EEG foundation models advance generalized decoding via unified architectures and large-scale pretraining, they adopt a scale-agnostic dense modeling paradigm inherited from NLP and vision. This design neglects a core property of neural activity: cross-scale spatiotemporal structure. EEG task patterns span a wide range of temporal and spatial scales, from short bursts to slow rhythms, and from localized cortical responses to distributed interactions. Ignoring this diversity leads to suboptimal representations and weak generalization. We propose CSBrain, a Cross-scale Spatiotemporal Brain foundation model for generalized EEG decoding. CSBrain introduces: (i) Cross-scale Spatiotemporal Tokenization (CST), which aggregates multi-scale features from localized temporal windows and anatomical brain regions into compact scale-aware tokens; and (ii) Structured Sparse Attention (SSA), which captures cross-window and cross-region dependencies, enhancing scale diversity while removing spurious correlations. CST and SSA are alternately stacked to progressively integrate multi-scale dependencies. Experiments on 11 EEG tasks across 16 datasets show that CSBrain consistently outperforms task-specific and foundation model baselines. These results establish cross-scale modeling as a key inductive bias and position CSBrain as a robust backbone for future brain-AI research.
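The tokenization idea described in the abstract (aggregating features from localized temporal windows and anatomical brain regions into scale-aware tokens) can be pictured with a minimal NumPy sketch. The function name, array shapes, window sizes, and the mean-pooling stand-in for a learned encoder are all illustrative assumptions, not the paper's actual CST module.

```python
import numpy as np

def cross_scale_tokens(eeg, window_sizes=(32, 128), regions=None):
    """Toy sketch of cross-scale spatiotemporal tokenization (CST).

    eeg: (channels, timepoints) array.
    window_sizes: temporal scales in samples (short bursts vs. slow rhythms).
    regions: lists of channel indices approximating anatomical brain regions.
    Emits one token per (scale, region, window); here a token is just the
    mean of the patch, standing in for a learned feature encoder.
    """
    n_chan, n_time = eeg.shape
    if regions is None:
        regions = [list(range(n_chan))]  # fall back to one whole-scalp region
    tokens = []
    for w in window_sizes:                       # temporal hierarchy
        for r, chans in enumerate(regions):      # spatial organization
            for i in range(n_time // w):         # localized windows
                patch = eeg[chans, i * w:(i + 1) * w]
                tokens.append((w, r, i, float(patch.mean())))
    return tokens
```

For example, 4 channels split into two regions over 256 timepoints would yield 8 fine windows x 2 regions plus 2 coarse windows x 2 regions, i.e. 20 scale-aware tokens.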
Problem

Research questions and friction points this paper is trying to address.

Decoding diverse EEG signals across temporal and spatial scales
Overcoming limitations of scale-agnostic EEG foundation models
Improving generalization for brain activity interpretation tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-scale Spatiotemporal Tokenization for EEG features
Structured Sparse Attention captures multi-scale dependencies
Alternating CST/SSA stacking integrates cross-window and cross-region dependencies
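The sparsity behind Structured Sparse Attention can be pictured as a mask over a grid of (region, window) tokens: each token attends along its own region (cross-window) and within its own time window (cross-region), rather than to every token pair. The indexing scheme and the exact "same region or same window" pattern below are assumptions for illustration, not the paper's published attention mask.

```python
import numpy as np

def structured_sparse_mask(n_regions, n_windows):
    """Toy sketch of a structured sparse attention (SSA) pattern.

    Tokens are indexed token_id = region * n_windows + window.
    Returns a boolean (N, N) mask where True means attention is allowed:
    same-region pairs (cross-window) and same-window pairs (cross-region),
    pruning arbitrary long-range token pairs as spurious correlations.
    """
    n_tokens = n_regions * n_windows
    reg = np.arange(n_tokens) // n_windows   # region index of each token
    win = np.arange(n_tokens) % n_windows    # window index of each token
    same_region = reg[:, None] == reg[None, :]
    same_window = win[:, None] == win[None, :]
    return same_region | same_window
```

Under this pattern each token attends to n_windows + n_regions - 1 tokens (including itself) instead of all n_regions * n_windows, so the mask grows linearly rather than quadratically in each axis.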
👥 Authors

Yuchen Zhou, Shanghai Artificial Intelligence Laboratory
Jiamin Wu, Shanghai Artificial Intelligence Laboratory; The Chinese University of Hong Kong
Zichen Ren, Shanghai Artificial Intelligence Laboratory
Zhouheng Yao, Shanghai Artificial Intelligence Laboratory
Weiheng Lu, Shanghai Artificial Intelligence Laboratory
Kunyu Peng, Karlsruhe Institute of Technology (video understanding, open set recognition, generalizable deep learning)
Qihao Zheng, Shanghai AI Lab (neuroscience, NeuroAI, AI4Neuro, AI4Science)
Chunfeng Song, Shanghai AI Lab (computer vision, pattern recognition, AI4Science)
Wanli Ouyang, Shanghai Artificial Intelligence Laboratory; The Chinese University of Hong Kong
Chao Gou, Associate Professor, Sun Yat-sen University (computer vision, machine learning, artificial intelligence, parallel intelligence)