OSF: On Pre-training and Scaling of Sleep Foundation Models

📅 2026-02-26

🤖 AI Summary
This study addresses the heterogeneity of sleep physiological signals across recording devices and cohorts by introducing SleepBench, a fully open-source benchmark built on 166,500 hours of sleep recordings from nine public sources. Using SleepBench, the authors systematically evaluate four families of self-supervised pre-training objectives and report three findings: existing sleep foundation models fail to generalize when channels are missing at inference; channel-invariant feature learning is essential during pre-training; and scaling sample size, model capacity, and the multi-source data mixture consistently improves downstream performance. The resulting OSF family of models achieves state-of-the-art performance on sleep and disease prediction tasks across nine datasets, and further analysis examines its sample efficiency, hierarchical aggregation, and cross-dataset scaling.

📝 Abstract
Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial heterogeneity across recording devices and cohorts. There have been growing efforts to build general-purpose foundation models (FMs) for sleep physiology, but the field still lacks an in-depth understanding of the pre-training process and scaling patterns that lead to more generalizable sleep FMs. To fill this gap, we curate a massive corpus of 166,500 hours of sleep recordings from nine public sources and establish SleepBench, a comprehensive, fully open-source benchmark. Leveraging SleepBench, we systematically evaluate four families of self-supervised pre-training objectives and uncover three critical findings: (1) existing FMs fail to generalize to missing channels at inference; (2) channel-invariant feature learning is essential for pre-training; and (3) scaling sample size, model capacity, and multi-source data mixture consistently improves downstream performance. With an enhanced pre-training and scaling recipe, we introduce OSF, a family of sleep FMs that achieves state-of-the-art performance across nine datasets on diverse sleep and disease prediction tasks. Further analysis of OSF also reveals intriguing properties in sample efficiency, hierarchical aggregation, and cross-dataset scaling.
Problem

Research questions and friction points this paper is trying to address.

foundation models
sleep physiology
pre-training
generalization
polysomnography
Innovation

Methods, ideas, or system contributions that make the work stand out.

foundation models
pre-training
sleep EEG
scaling laws
channel-invariant learning