SleepLM: Natural-Language Intelligence for Human Sleep

📅 2026-02-27
🤖 AI Summary
This work addresses the limited generalizability of existing learning-based sleep analysis systems, which are constrained by closed label spaces and therefore struggle to adapt to novel sleep phenomena or to support flexible querying. To overcome this, the authors propose SleepLM, a family of sleep-language foundation models that align multimodal polysomnography signals with natural language, enabling linguistic representation and interactive exploration of sleep physiology. They build a multilevel caption generation pipeline to curate a large-scale paired sleep-text dataset (over 100K hours from more than 10,000 individuals) and introduce a unified pretraining objective that combines contrastive alignment, caption generation, and signal reconstruction. The resulting models outperform state-of-the-art approaches in zero-shot and few-shot learning, cross-modal retrieval, and sleep captioning, and show language-guided capabilities such as event localization and targeted insight generation.
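As an illustration of what zero-shot querying over aligned signal and text embeddings could look like, here is a minimal sketch. It assumes a signal encoder and a text encoder have already mapped a sleep epoch and a set of natural-language prompts into a shared embedding space; the function name `zero_shot_classify`, the prompt wording, the temperature value, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np

def zero_shot_classify(signal_emb, prompt_embs, labels, temperature=0.07):
    """Pick the label whose text-prompt embedding is closest to the signal embedding.

    signal_emb:  (D,) embedding of one sleep epoch (assumed output of a signal encoder)
    prompt_embs: (K, D) embeddings of K text prompts (assumed output of a text encoder)
    labels:      list of K label names, one per prompt
    """
    # Cosine similarity requires unit-norm vectors.
    s = signal_emb / np.linalg.norm(signal_emb)
    p = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    sims = p @ s  # (K,) cosine similarities

    # Temperature-scaled softmax over prompts (numerically stabilized).
    scaled = (sims - sims.max()) / temperature
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return labels[int(np.argmax(probs))], probs

# Toy example with made-up 3-dimensional embeddings.
labels = ["wake", "REM sleep", "deep sleep"]
prompt_embs = np.eye(3)                 # stand-ins for encoded prompts
signal_emb = np.array([0.1, 0.9, 0.2])  # stand-in for an encoded epoch
predicted, probs = zero_shot_classify(signal_emb, prompt_embs, labels)
print(predicted)
```

Because the label set is just a list of prompts, new categories can be added at query time without retraining, which is the property an open label space is meant to provide.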

📝 Abstract
We present SleepLM, a family of sleep-language foundation models that enable human sleep alignment, interpretation, and interaction with natural language. Despite the critical role of sleep, learning-based sleep analysis systems operate in closed label spaces (e.g., predefined stages or events) and fail to describe, query, or generalize to novel sleep phenomena. SleepLM bridges natural language and multimodal polysomnography, enabling language-grounded representations of sleep physiology. To support this alignment, we introduce a multilevel sleep caption generation pipeline that enables the curation of the first large-scale sleep-text dataset, comprising over 100K hours of data from more than 10,000 individuals. Furthermore, we present a unified pretraining objective that combines contrastive alignment, caption generation, and signal reconstruction to better capture physiological fidelity and cross-modal interactions. Extensive experiments on real-world sleep understanding tasks verify that SleepLM outperforms state-of-the-art methods in zero-shot and few-shot learning, cross-modal retrieval, and sleep captioning. Importantly, SleepLM also exhibits intriguing capabilities including language-guided event localization, targeted insight generation, and zero-shot generalization to unseen tasks. All code and data will be open-sourced.
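The abstract names three components of the pretraining objective but does not give their form here. The sketch below shows one common way such terms are combined, assuming a symmetric InfoNCE loss for contrastive alignment, token-level cross-entropy for caption generation, and mean squared error for signal reconstruction. The loss forms, the weights `w_con`, `w_cap`, `w_rec`, the temperature, and all function names are assumptions for illustration, not the paper's stated formulation.

```python
import numpy as np

def log_softmax(x, axis=-1):
    """Numerically stable log-softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def contrastive_loss(sig_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (signal, text) embeddings, shape (B, D)."""
    s = sig_emb / np.linalg.norm(sig_emb, axis=1, keepdims=True)
    t = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = s @ t.T / temperature            # (B, B); matched pairs lie on the diagonal
    idx = np.arange(len(s))
    sig_to_txt = -log_softmax(logits, axis=1)[idx, idx].mean()
    txt_to_sig = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (sig_to_txt + txt_to_sig)

def caption_loss(token_logits, target_ids):
    """Mean cross-entropy; token_logits is (B, T, V), target_ids is integer (B, T)."""
    logp = log_softmax(token_logits, axis=-1)
    picked = np.take_along_axis(logp, target_ids[..., None], axis=-1)
    return -picked.mean()

def reconstruction_loss(recon, signal):
    """Mean squared error between reconstructed and original signal arrays."""
    return np.mean((recon - signal) ** 2)

def unified_loss(sig_emb, txt_emb, token_logits, target_ids, recon, signal,
                 w_con=1.0, w_cap=1.0, w_rec=1.0):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    return (w_con * contrastive_loss(sig_emb, txt_emb)
            + w_cap * caption_loss(token_logits, target_ids)
            + w_rec * reconstruction_loss(recon, signal))
```

In this reading, the contrastive term supports zero-shot classification and retrieval, the captioning term supports description generation, and the reconstruction term encourages the signal encoder to retain physiological detail that the text alone does not describe.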
Problem

Research questions and friction points this paper is trying to address.

sleep analysis
natural language
closed label space
sleep generalization
multimodal polysomnography
Innovation

Methods, ideas, or system contributions that make the work stand out.

sleep-language foundation model
multimodal polysomnography
cross-modal alignment
zero-shot generalization
sleep captioning
Authors

Zongzhe Xu
University of California, Los Angeles
Zitao Shuai
UCLA; University of Michigan
Eideen Mozaffari
University of California, Los Angeles
Ravi S. Aysola
University of California, Los Angeles
Rajesh Kumar
University of California, Los Angeles
Yuzhe Yang
Assistant Professor, UCLA
Machine Learning, Artificial Intelligence, Healthcare, Computational Medicine