🤖 AI Summary
Obstructive sleep apnea (OSA) remains underdiagnosed because polysomnography (PSG) is costly and complex. Audio-based screening methods are hindered by environmental noise and lack access to critical physiological signals, particularly respiratory effort, while conventional respiratory effort monitoring relies on contact sensors, which limits scalability and patient adherence. This work presents the first study to estimate full-night respiratory effort signals directly from home-recorded audio. We propose a latent-space fusion framework that combines the estimated respiratory effort embeddings with acoustic features for OSA detection. On 157 nights of home recordings from 103 participants, the respiratory effort estimator achieves a concordance correlation coefficient (CCC) of 0.48. The fusion model outperforms audio-only baselines, improving sensitivity (+9.2%) and AUC (+0.06), especially at low apnea–hypopnea index (AHI) thresholds (<5). Because only smartphone audio is required at test time, the approach offers a scalable, contactless, physiology-aware route to remote OSA screening.
📝 Abstract
Obstructive sleep apnoea (OSA) is a prevalent condition with significant health consequences, yet many patients remain undiagnosed due to the complexity and cost of overnight polysomnography. Acoustic-based screening provides a scalable alternative, but its performance is limited by environmental noise and the lack of physiological context. Respiratory effort is a key signal used in clinical scoring of OSA events, but current approaches require additional contact sensors that reduce scalability and patient comfort. This paper presents the first study to estimate respiratory effort directly from nocturnal audio, enabling physiological context to be recovered from sound alone. We propose a latent-space fusion framework that integrates the estimated effort embeddings with acoustic features for OSA detection. Using a dataset of 157 nights from 103 participants recorded in home environments, our respiratory effort estimator achieves a concordance correlation coefficient of 0.48, capturing meaningful respiratory dynamics. Fusing effort and audio improves sensitivity and AUC over audio-only baselines, especially at low apnoea-hypopnoea index thresholds. The proposed approach requires only smartphone audio at test time, enabling sensor-free, scalable, and longitudinal OSA monitoring.
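For readers unfamiliar with the concordance correlation coefficient quoted in the abstract, the sketch below shows the standard definition (Lin's coefficient): twice the covariance of the two signals divided by the sum of their variances and the squared difference of their means. It is a minimal illustration, not the paper's evaluation code. The function name, the toy signals, and the use of population (1/n) statistics are assumptions made here, and the abstract does not say how the signals were windowed or normalised before scoring.

```python
def concordance_correlation(reference, estimate):
    """Lin's concordance correlation coefficient for two equal-length signals.

    Hypothetical helper for illustration; uses population (1/n) moments.
    """
    n = len(reference)
    if n != len(estimate) or n < 2:
        raise ValueError("signals must have the same length (at least 2 samples)")

    mean_ref = sum(reference) / n
    mean_est = sum(estimate) / n

    # Population variances and covariance of the two signals.
    var_ref = sum((r - mean_ref) ** 2 for r in reference) / n
    var_est = sum((e - mean_est) ** 2 for e in estimate) / n
    cov = sum((r - mean_ref) * (e - mean_est)
              for r, e in zip(reference, estimate)) / n

    denom = var_ref + var_est + (mean_ref - mean_est) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant signals")
    return 2 * cov / denom


# Toy signals (made up): the estimate tracks the reference but with an offset.
reference_effort = [1.0, 2.0, 3.0, 4.0]
offset_estimate = [2.0, 3.0, 4.0, 5.0]
ccc_offset = concordance_correlation(reference_effort, offset_estimate)
```

Unlike Pearson correlation, which would be exactly 1 for the offset signal above, the concordance coefficient drops below 1 because it also penalises differences in mean and scale. It measures agreement with the reference signal, not only co-variation.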