🤖 AI Summary
Existing EEG foundation models suffer from two key limitations: (1) they neglect the heterogeneity of spatial and temporal dependencies in EEG signals, and (2) they generalize poorly across multi-source, heterogeneous EEG data formats. To address these challenges, the authors propose CBraMod, a general-purpose foundation model for EEG decoding. The method introduces three core components: (1) a criss-cross Transformer backbone that decouples spatial and temporal dependency modeling through two parallel attention mechanisms; (2) an asymmetric conditional positional encoding scheme that adapts to diverse EEG formats varying in sampling rate, channel count, and experimental paradigm; and (3) a patch-based masked EEG reconstruction pretraining objective on a large EEG corpus. Evaluated on 10 downstream BCI tasks across 12 public datasets, CBraMod achieves state-of-the-art performance, demonstrating strong generalizability and robustness. The implementation is publicly available.
📝 Abstract
Electroencephalography (EEG) is a non-invasive technique for measuring and recording the brain's electrical activity, widely used in BCI and healthcare applications. Early EEG decoding methods relied on supervised learning and were limited to specific tasks and datasets, hindering model performance and generalizability. With the success of large language models, a growing body of studies has focused on EEG foundation models. However, these studies still leave open challenges. First, most existing EEG foundation models adopt a full EEG modeling strategy that models the spatial and temporal dependencies among all EEG patches jointly, ignoring that these dependencies are heterogeneous due to the unique structural characteristics of EEG signals. Second, existing EEG foundation models generalize poorly across a wide range of downstream BCI tasks because EEG data come in varying formats, which are challenging to adapt to. To address these challenges, we propose a novel foundation model called CBraMod. Specifically, we devise a criss-cross transformer as the backbone to thoroughly leverage the structural characteristics of EEG signals; it models spatial and temporal dependencies separately through two parallel attention mechanisms. We further adopt an asymmetric conditional positional encoding scheme that encodes the positional information of EEG patches and can be easily adapted to EEG with diverse formats. CBraMod is pre-trained on a very large EEG corpus through patch-based masked EEG reconstruction. We evaluate CBraMod on up to 10 downstream BCI tasks (12 public datasets). CBraMod achieves state-of-the-art performance across this wide range of tasks, demonstrating its strong capability and generalizability. The source code is publicly available at https://github.com/wjq-learning/CBraMod.
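The core idea of the criss-cross attention described above can be sketched in a few lines: EEG patches form a (channels × time) grid, and one attention path mixes patches across channels at each time step while a parallel path mixes patches across time within each channel. The sketch below is a minimal, hedged illustration in NumPy, not the authors' implementation: it assumes identity query/key/value projections, a simple sum of the two paths, and toy dimensions (19 channels, 30 one-second patches, feature dim 8); the real model would use learned projections, multi-head attention, and feed-forward layers.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention over the second-to-last (sequence) axis.
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def criss_cross_block(x):
    """x: array of shape (C channels, T time patches, D features).

    Spatial path: for each time index, attend over the C channels.
    Temporal path: for each channel, attend over the T time patches.
    The two parallel paths are summed (a simplifying assumption here).
    """
    xs = np.swapaxes(x, 0, 1)                           # (T, C, D)
    spatial = np.swapaxes(attention(xs, xs, xs), 0, 1)  # back to (C, T, D)
    temporal = attention(x, x, x)                       # (C, T, D)
    return spatial + temporal

rng = np.random.default_rng(0)
x = rng.standard_normal((19, 30, 8))  # e.g., 19 channels, 30 one-second patches
y = criss_cross_block(x)
print(y.shape)  # (19, 30, 8): same grid shape, dependencies modeled separately
```

Compared with full attention over all C×T patches (quadratic in C·T), the two axis-wise paths attend over only C or T positions at a time, which is the structural prior the abstract refers to.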