🤖 AI Summary
rPPG suffers from poor signal quality and insufficient robustness in skin segmentation under varying illumination, motion artifacts, and diverse skin tones. To address these challenges, this paper proposes a weighted whole-body skin segmentation method. Its key contributions are: (1) a priority-based skin region selection mechanism that dynamically assigns higher weights to high-SNR regions such as the face and neck; (2) SYNC-rPPG, a real-world dataset spanning broad skin tone variation, realistic lighting conditions, and natural motion (e.g., speaking, head turning); and (3) a video-driven end-to-end segmentation framework designed to improve cross-skin-tone generalization. Experiments show that the proposed method reduces the mean absolute error (MAE) of heart rate estimation by 1.2–2.4 bpm on SYNC-rPPG and multiple public benchmarks, outperforming state-of-the-art approaches, particularly in complex, dynamic scenarios.
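To make the weighting idea concrete, the sketch below shows one plausible way to fuse per-region rPPG traces by their spectral SNR, so that high-quality regions (e.g., face, neck) dominate the final signal. This is an illustrative assumption, not the authors' released implementation; the function names and the SNR definition (in-band vs. out-of-band spectral power over a plausible heart-rate band) are hypothetical.

```python
import numpy as np
from scipy.signal import periodogram

def region_snr(trace, fs, band=(0.7, 4.0)):
    """Rough SNR proxy: spectral power inside the plausible HR band vs. outside it."""
    freqs, power = periodogram(trace - trace.mean(), fs=fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[in_band].sum() / (power[~in_band].sum() + 1e-12)

def fuse_regions(region_traces, fs):
    """Weight each skin region's trace by its SNR and combine (assumed weighting scheme)."""
    weights = np.array([region_snr(t, fs) for t in region_traces])
    weights = weights / (weights.sum() + 1e-12)
    traces = np.stack(region_traces)            # shape: (n_regions, n_frames)
    return (weights[:, None] * traces).sum(axis=0)
```

In such a scheme, low-SNR regions (e.g., hair-occluded or shadowed areas) contribute little, which mirrors the priority-based selection described above.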
📝 Abstract
Remote photoplethysmography (rPPG) is an innovative method for monitoring heart rate and other vital signs by recording a person with a simple camera, as long as some part of their skin is visible. This low-cost, contactless approach supports remote patient monitoring, emotion analysis, smart vehicle applications, and more. Over the years, various techniques have been proposed to improve the accuracy of this technology, which is particularly sensitive to lighting and movement. In the unsupervised pipeline, skin regions must first be selected from the video so that the rPPG signal can be extracted from skin color changes. We introduce a novel skin segmentation technique that prioritizes skin regions to enhance the quality of the extracted signal. It can detect skin across the whole body, making it more robust to movement, while excluding regions such as the mouth, eyes, and hair that may introduce interference. We evaluate our model on publicly available datasets and also present a new dataset, SYNC-rPPG, that better represents real-world conditions. The results show that our model demonstrates a superior ability to capture heartbeats in challenging conditions, such as talking and head rotation, and maintains a low mean absolute error (MAE) between predicted and actual heart rates where other methods fail to do so. In addition, we demonstrate high accuracy across a diverse range of skin tones, making this technique a promising option for real-world applications.
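For readers unfamiliar with where segmentation fits in an unsupervised rPPG pipeline, the following minimal sketch shows a generic baseline: average the RGB values over the segmented skin mask in each frame, project the resulting trace with the standard POS method (Wang et al., 2017), and read the heart rate from the dominant spectral peak. This is a generic illustration of the pipeline stage the paper targets, not the authors' proposed model; frame/mask formats and window parameters are assumptions.

```python
import numpy as np

def mean_rgb_over_skin(frames, masks):
    """Average RGB over the segmented skin pixels of each frame -> (T, 3) trace."""
    return np.array([frame[mask].mean(axis=0) for frame, mask in zip(frames, masks)])

def pos_rppg(rgb_trace, fs, win_sec=1.6):
    """Plane-Orthogonal-to-Skin (POS) projection of an RGB trace into a pulse signal."""
    T = rgb_trace.shape[0]
    win = int(win_sec * fs)
    pulse = np.zeros(T)
    P = np.array([[0.0, 1.0, -1.0], [-2.0, 1.0, 1.0]])   # POS projection axes
    for start in range(T - win + 1):
        block = rgb_trace[start:start + win]
        norm = block / (block.mean(axis=0) + 1e-12)       # temporal normalization
        s = norm @ P.T
        h = s[:, 0] + (s[:, 0].std() / (s[:, 1].std() + 1e-12)) * s[:, 1]
        pulse[start:start + win] += h - h.mean()          # overlap-add
    return pulse

def heart_rate_bpm(pulse, fs, band=(0.7, 4.0)):
    """Dominant frequency inside the plausible heart-rate band, in beats per minute."""
    freqs = np.fft.rfftfreq(len(pulse), 1 / fs)
    power = np.abs(np.fft.rfft(pulse - pulse.mean())) ** 2
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return 60.0 * freqs[in_band][np.argmax(power[in_band])]
```

The quality of the skin mask passed to `mean_rgb_over_skin` directly determines the quality of the extracted trace, which is why the proposed whole-body, interference-excluding segmentation matters under motion and across skin tones.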