Data-Efficient Sleep Staging with Synthetic Time Series Pretraining

📅 2024-03-13

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper targets few-shot, cross-subject EEG sleep staging, where inter-subject variability is severe and labeled data are scarce. It proposes a frequency-based pretraining paradigm that requires no real annotated EEG recordings: the method generates synthetic time series and pretrains a network to predict their frequency content, so that CNN- or Transformer-based models acquire frequency-domain representations before they see real data. The key contribution is "frequency pretraining" as a representation learning approach that decouples feature learning from dependence on large-scale real EEG datasets. Experiments show that the method outperforms fully supervised baselines in extreme few-shot settings (≤5 subjects) and remains competitive in multi-subject evaluation. Ablation studies confirm the importance of spectral priors and also indicate that the model uses non-spectral features to improve sleep staging performance.

📝 Abstract

Analyzing electroencephalographic (EEG) time series can be challenging, especially with deep neural networks, due to the large variability among human subjects and often small datasets. To address these challenges, various strategies, such as self-supervised learning, have been suggested, but they typically rely on extensive empirical datasets. Inspired by recent advances in computer vision, we propose a pretraining task termed "frequency pretraining" to pretrain a neural network for sleep staging by predicting the frequency content of randomly generated synthetic time series. Our experiments demonstrate that our method surpasses fully supervised learning in scenarios with limited data and few subjects, and matches its performance in regimes with many subjects. Furthermore, our results underline the relevance of frequency information for sleep stage scoring, while also demonstrating that deep neural networks utilize information beyond frequencies to enhance sleep staging performance, which is consistent with previous research. We anticipate that our approach will be advantageous across a broad spectrum of applications where EEG data is limited or derived from a small number of subjects, including the domain of brain-computer interfaces.
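
The pretraining task described in the abstract, predicting the frequency content of randomly generated synthetic time series, can be illustrated with a small data generator. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `make_synthetic_sample`, the number of frequency bins, the sampling rate, the epoch length, the frequency range, and the use of one sine wave per active bin are all hypothetical choices made for illustration. Each sample is a sum of sine waves whose frequencies fall in randomly selected bins, and the label is a multi-hot vector marking which bins are active.

```python
import numpy as np


def make_synthetic_sample(rng, n_bins=20, fs=100.0, duration_s=30.0,
                          f_min=0.5, f_max=30.0):
    """Return one (signal, label) pair for a frequency-prediction task.

    All defaults are illustrative assumptions, not values from the paper.
    label[b] == 1 means the signal contains a sine wave in frequency bin b.
    """
    # Split [f_min, f_max] into n_bins equally wide frequency bins.
    edges = np.linspace(f_min, f_max, n_bins + 1)

    # Randomly switch each bin on or off; force at least one active bin.
    label = rng.integers(0, 2, size=n_bins).astype(np.float32)
    if label.sum() == 0:
        label[rng.integers(n_bins)] = 1.0

    # Build the signal as a sum of sines, one per active bin,
    # each with a random frequency inside its bin and a random phase.
    t = np.arange(int(fs * duration_s)) / fs
    signal = np.zeros_like(t)
    for b in np.flatnonzero(label):
        freq = rng.uniform(edges[b], edges[b + 1])
        phase = rng.uniform(0.0, 2.0 * np.pi)
        signal += np.sin(2.0 * np.pi * freq * t + phase)

    # Standardize so the network sees zero-mean, unit-variance input.
    signal = (signal - signal.mean()) / (signal.std() + 1e-8)
    return signal.astype(np.float32), label


rng = np.random.default_rng(0)
x, y = make_synthetic_sample(rng)
print(x.shape, y.shape)
```

A network pretrained on such pairs would typically end in one output per bin and be trained with a multi-label loss (for example, binary cross-entropy), after which its feature extractor can be fine-tuned on the small labeled EEG dataset.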

Problem

Research questions and friction points this paper is trying to address.

Addressing EEG variability and small datasets

Improving data-efficient sleep staging with synthetic pretraining

Enhancing performance with limited data and few subjects

Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic time series pretraining for sleep staging

Frequency pretraining with neural networks

Outperforms supervised learning in data-limited scenarios
