🤖 AI Summary
This study addresses three limitations of wrist-worn accelerometry for sleep–wake classification: poor wake detection, lack of cross-device generalizability, and limited validation across age ranges and sleep disorders (e.g., sleep apnea, periodic limb movements in sleep). We develop a deep learning model that operates on triaxial accelerometer data. Features are extracted in 30-second epochs and passed to a three-class model that distinguishes wake, sleep, and sleep with arousals; a decision tree then collapses these outputs into wake vs. sleep. To improve wake detection, the model was trained on randomly selected subjects with low sleep efficiency and/or high arousal index recorded with one device, then tested on the remaining recordings. Evaluated against polysomnography (PSG) in 453 adults at a tertiary care sleep laboratory using three devices, the model achieves an F1 score of 0.86, sleep sensitivity of 0.87, wake specificity of 0.78, and moderate correlation with PSG-derived total sleep time (R = 0.69) and sleep efficiency (R = 0.63). Performance was consistent across the three devices and robust to the presence of sleep disorders.
📝 Abstract
Study Objectives: Wrist accelerometry is widely used for inferring sleep-wake state. Previous work has shown poor wake detection and has lacked cross-device generalizability and validation across age ranges and sleep disorders. We developed a robust deep learning model to detect sleep-wakefulness from triaxial accelerometry and evaluated its validity across three devices and in a large adult population spanning a wide range of ages, with and without sleep disorders. Methods: We collected wrist accelerometry simultaneously with polysomnography (PSG) in 453 adults undergoing clinical sleep testing at a tertiary care sleep laboratory, using three devices. We extracted features in 30-second epochs and trained a 3-class model to detect wake, sleep, and sleep with arousals, which was then collapsed into wake vs. sleep using a decision tree. To enhance wake detection, the model was specifically trained on randomly selected subjects with low sleep efficiency and/or high arousal index from one device recording and then tested on the remaining recordings. Results: The model showed high performance, with an F1 score of 0.86, sensitivity (sleep) of 0.87, and specificity (wakefulness) of 0.78, and a significant, moderate correlation with PSG in predicting total sleep time (R=0.69) and sleep efficiency (R=0.63). Model performance was robust to the presence of sleep disorders, including sleep apnea and periodic limb movements in sleep, and was consistent across all three accelerometer models. Conclusions: We present a deep learning model to detect sleep-wakefulness from actigraphy in adults, with relative robustness to the presence of sleep disorders and generalizability across diverse, commonly used wrist accelerometers.
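To make the reported metrics concrete, the following is a minimal sketch of how epoch-wise sensitivity, specificity, F1, total sleep time, and sleep efficiency could be computed from 30-second epoch labels. It is not the authors' evaluation code. The function names, the label encoding (1 = sleep, 0 = wake), the choice of sleep as the positive class for F1, and the use of recording length as time in bed are all assumptions made for illustration; the abstract does not state how F1 was averaged.

```python
from typing import Dict, Sequence

EPOCH_SECONDS = 30  # epoch length stated in the abstract


def sleep_wake_metrics(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Epoch-wise metrics, assuming 1 = sleep (positive class) and 0 = wake."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same number of epochs")

    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sleep scored as sleep
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # wake scored as wake
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # wake scored as sleep
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sleep scored as wake

    nan = float("nan")
    return {
        # Sensitivity: fraction of PSG sleep epochs detected as sleep.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: fraction of PSG wake epochs detected as wake.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # F1 for the sleep class (assumption: the paper may average differently).
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
    }


def total_sleep_time_minutes(labels: Sequence[int]) -> float:
    """Total sleep time: number of sleep epochs times the epoch length."""
    return sum(labels) * EPOCH_SECONDS / 60


def sleep_efficiency(labels: Sequence[int]) -> float:
    """Sleep fraction of the recording (assumes recording length = time in bed)."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(labels) / len(labels)


# Toy example with eight hypothetical epochs (not data from the study).
psg = [1, 1, 1, 0, 0, 1, 1, 0]
model = [1, 1, 0, 0, 1, 1, 1, 0]
metrics = sleep_wake_metrics(psg, model)
```

Per-subject total sleep time and sleep efficiency computed this way from the model output and from PSG could then be compared with a correlation coefficient, which is the kind of agreement the R values above summarize.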