🤖 AI Summary
EEG analysis is hampered by scarce labeled data, high dimensionality, low spatial resolution, and insufficient modeling of spatiotemporal dependencies; existing self-supervised methods often decouple spatial and temporal features, which limits representational capacity. To address this, we propose what is, to our knowledge, the first adaptation of the Video Joint-Embedding Predictive Architecture (V-JEPA) to EEG classification, introducing a physiology-aware adaptive masking mechanism that enables interpretable modeling of semantically meaningful spatiotemporal patterns. By treating EEG signals as video-like sequences, our method learns compact, discriminative spatiotemporal representations via joint-embedding prediction. Evaluated on the TUH Abnormal EEG dataset, it achieves higher classification accuracy than current state-of-the-art methods. Moreover, the learned representations are physiologically interpretable: they capture clinically relevant spatial and temporal signal patterns, which may support human-AI collaboration in clinical diagnosis.
📝 Abstract
EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. However, effective analysis is hindered by limited labeled data, high dimensionality, and the absence of scalable models that fully capture spatiotemporal dependencies. Existing self-supervised learning (SSL) methods often focus on either spatial or temporal features, leading to suboptimal representations. To address this, we propose EEG-VJEPA, a novel adaptation of the Video Joint Embedding Predictive Architecture (V-JEPA) for EEG classification. By treating EEG as video-like sequences, EEG-VJEPA learns semantically meaningful spatiotemporal representations using joint embeddings and adaptive masking. To our knowledge, this is the first work that exploits V-JEPA for EEG classification and explores the visual concepts learned by the model. Evaluations on the publicly available Temple University Hospital (TUH) Abnormal EEG dataset show that EEG-VJEPA outperforms existing state-of-the-art models in classification accuracy. Beyond accuracy, EEG-VJEPA captures physiologically relevant spatial and temporal signal patterns, offering interpretable embeddings that may support human-AI collaboration in diagnostic workflows. These findings position EEG-VJEPA as a promising framework for scalable, trustworthy EEG analysis in real-world clinical settings.
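
To make the "EEG as video-like sequences" idea concrete, the following is a minimal NumPy sketch of one way a recording could be cut into frames, split into spatiotemporal patches, and divided into context and target tokens. Everything in it is an assumption for illustration, not the paper's implementation: the function names (`eeg_to_frames`, `patchify`, `random_mask`), the 16-channel shape, the frame and patch sizes, and the 75% mask ratio are hypothetical, and plain random masking stands in for the adaptive masking the abstract mentions. The encoder and predictor that a joint-embedding method would train on these tokens are omitted.

```python
import numpy as np


def eeg_to_frames(eeg, frame_len):
    """Split a (channels, samples) recording into (frames, channels, frame_len)."""
    n_channels, n_samples = eeg.shape
    n_frames = n_samples // frame_len
    # Drop trailing samples that do not fill a whole frame.
    trimmed = eeg[:, : n_frames * frame_len]
    # (channels, frames, frame_len) -> (frames, channels, frame_len)
    return trimmed.reshape(n_channels, n_frames, frame_len).transpose(1, 0, 2)


def patchify(frames, tubelet, patch_h, patch_w):
    """Cut a (T, H, W) clip into non-overlapping spatiotemporal patches.

    Returns an array of shape (n_tokens, tubelet * patch_h * patch_w).
    """
    t, h, w = frames.shape
    assert t % tubelet == 0 and h % patch_h == 0 and w % patch_w == 0
    x = frames.reshape(
        t // tubelet, tubelet, h // patch_h, patch_h, w // patch_w, patch_w
    )
    # Bring the three grid axes to the front, the within-patch axes to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, tubelet * patch_h * patch_w)


def random_mask(n_tokens, mask_ratio, rng):
    """Boolean mask with True marking tokens hidden from the context encoder."""
    n_masked = int(round(n_tokens * mask_ratio))
    mask = np.zeros(n_tokens, dtype=bool)
    mask[rng.permutation(n_tokens)[:n_masked]] = True
    return mask


rng = np.random.default_rng(0)
eeg = rng.standard_normal((16, 2560))        # hypothetical: 16 channels, 2560 samples
frames = eeg_to_frames(eeg, frame_len=160)   # shape (16, 16, 160)
tokens = patchify(frames, tubelet=2, patch_h=4, patch_w=16)  # shape (320, 128)
mask = random_mask(len(tokens), mask_ratio=0.75, rng=rng)

context_tokens = tokens[~mask]   # visible tokens, input to a context encoder
target_tokens = tokens[mask]     # hidden tokens, whose embeddings would be predicted
```

In a joint-embedding predictive setup, the loss would compare predicted and target embeddings rather than raw signal values, which is why the sketch stops at the token split instead of reconstructing the masked samples.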