🤖 AI Summary
To address the poor robustness of remote photoplethysmography (rPPG) physiological monitoring under challenging real-world conditions—such as drastic illumination variations (e.g., high-altitude environments), frequent facial occlusions, and dynamic head poses—this paper proposes a non-contact, long-term health monitoring framework tailored for daily personal care scenarios (e.g., mirror-based grooming). Methodologically, we introduce LADH, the first long-duration, multimodal rPPG dataset featuring synchronized RGB+IR video streams alongside ground-truth PPG, respiration, and blood oxygen signals. We further propose the first RGB–IR cross-modal fusion strategy integrated with multi-task deep learning, along with a feature extraction mechanism robust to dynamic facial motion. Experiments demonstrate state-of-the-art performance: a mean absolute error of only 4.99 BPM in heart rate estimation, significantly outperforming existing methods under strong illumination interference, hand-induced occlusions, and large pose variations. Both the code and the LADH dataset are publicly released.
📝 Abstract
Remote photoplethysmography (rPPG) enables non-contact, continuous monitoring of physiological signals and offers a practical alternative to traditional health sensing methods. Although rPPG is promising for daily health monitoring, its application in long-term personal care scenarios, such as mirror-facing routines in high-altitude environments, remains challenging due to ambient lighting variations, frequent occlusions from hand movements, and dynamic facial postures. To address these challenges, we present LADH (Long-term Altitude Daily Health), the first long-term rPPG dataset containing 240 synchronized RGB and infrared (IR) facial videos from 21 participants across five common personal care scenarios, along with ground-truth PPG, respiration, and blood oxygen signals. Our experiments demonstrate that combining RGB and IR video inputs improves the accuracy and robustness of non-contact physiological monitoring, achieving a mean absolute error (MAE) of 4.99 BPM in heart rate estimation. Furthermore, we find that multi-task learning enhances performance across multiple physiological indicators simultaneously. The dataset and code are publicly available at https://github.com/McJackTang/FusionVitals.
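The heart-rate figure quoted above (4.99 BPM) is a mean absolute error. As a minimal sketch — not code from the FusionVitals repository — the metric can be computed over per-window heart-rate estimates like this (`mae_bpm` and the sample values are hypothetical):

```python
def mae_bpm(pred_hr, true_hr):
    """Mean absolute error (in BPM) between predicted and ground-truth
    heart rates, averaged over all evaluation windows."""
    assert len(pred_hr) == len(true_hr) and len(pred_hr) > 0
    return sum(abs(p - t) for p, t in zip(pred_hr, true_hr)) / len(pred_hr)

# Illustrative values only: two windows with errors of 2 and 5 BPM.
print(mae_bpm([72.0, 80.0], [70.0, 85.0]))  # → 3.5
```

In practice, each predicted heart rate is typically derived from the dominant frequency of the recovered rPPG waveform over a fixed-length video window, then compared against the rate from the synchronized contact PPG signal.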