🤖 AI Summary
Existing self-supervised rPPG methods are highly susceptible to strong periodic confounders such as motion and illumination, which leads to spurious correlations and poor generalization. This work proposes the Physiological Causal Probing (PCP) paradigm, which treats rPPG as a latent physical source and shifts from passive correlation learning to active causal validation. PCP performs controllable interventions in the low-frequency chrominance domain of videos and checks the outcome with two mechanisms: falsifiability via nulling and axiomatic equivariance. To keep the interventions physically plausible, PCP pairs a PhysMambaFormer-based hypothesis generator with a controllable physiological signal editor. Experiments show that PCP improves both in-domain and cross-domain performance on challenging datasets such as VIPL-HR and MMPD, surpasses the supervised baseline in complex cross-dataset settings, and resists motion and illumination artifacts.
📝 Abstract
Remote Photoplethysmography (rPPG) enables convenient non-contact physiological measurement. Existing Self-Supervised Learning (SSL) methods commonly fall into a correlation trap: they tend to learn the most dominant periodic signals in the data, such as high-energy motion or illumination noise, rather than the faint, true rPPG signal, leading to poor model generalization. To address this, we propose a new SSL paradigm, Physiological Causal Probing (PCP), which treats the latent rPPG signal as the underlying physical source and the resulting pixel chrominance variations as its visual manifestation. Its core idea is to shift from passive correlation learning to active, precise intervention: it intervenes on the video based on a proposed rPPG hypothesis, and verifies whether the post-intervention changes match physical expectations. We propose the Interv-rPPG framework to implement PCP: an rPPG extractor named PhysMambaFormer hypothesizes the rPPG signal, while a Controllable Physiological Signal Editor conducts precise chrominance-domain interventions on videos based on this hypothesis. Interv-rPPG validates the physical realism of the hypothesis through "Falsifiability via Nulling" and "Axiomatic Equivariance". Our editor achieves precise editing of the rPPG signal by intervening in the low-frequency chrominance components of the video. Our method improves both in-domain and cross-domain performance on challenging datasets such as VIPL-HR and MMPD. Furthermore, it surpasses the supervised baseline in complex cross-dataset settings, while remaining competitive on clean datasets where the intervention mechanism may introduce slight residual chrominance noise. Extensive experiments, including diagnostic analysis of nuisance sensitivity, demonstrate that the PCP paradigm effectively resists motion and illumination artifacts.
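The intervene-then-verify idea can be illustrated with a deliberately simplified 1-D toy. This is not the Interv-rPPG implementation: the additive `edit` function, the synthetic chrominance trace, the spectral power ratio used as the check, and all names and constants below are assumptions made for illustration only. The sketch shows that nulling a component that is really present removes its spectral power, and doubling it quadruples that power, while a hypothesis that is absent from the trace fails both checks.

```python
import numpy as np


def dominant_freq(x, fs):
    """Frequency (Hz) of the strongest spectral peak of x."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec = np.abs(np.fft.rfft(x - x.mean()))
    return freqs[np.argmax(spec)]


def band_power(x, fs, f0, half_width=0.1):
    """Spectral power of x within f0 +/- half_width Hz."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    return power[np.abs(freqs - f0) <= half_width].sum()


def edit(chroma, hypothesis, gain):
    """Hypothetical editor: rescale the hypothesised component by `gain`.

    gain=0 removes it (nulling); gain=2 doubles it. Assumes the
    component enters the chrominance trace additively.
    """
    return chroma + (gain - 1.0) * hypothesis


def power_ratio(chroma, hypothesis, fs, gain):
    """Power at the hypothesis frequency after editing, relative to before."""
    f0 = dominant_freq(hypothesis, fs)
    edited = edit(chroma, hypothesis, gain)
    return band_power(edited, fs, f0) / band_power(chroma, fs, f0)


# Synthetic trace: faint 1.2 Hz pulse, strong 0.5 Hz motion, small noise.
rng = np.random.default_rng(0)
fs = 30.0
t = np.arange(600) / fs
pulse = 0.05 * np.sin(2 * np.pi * 1.2 * t)
motion = 0.5 * np.sin(2 * np.pi * 0.5 * t)
chroma = pulse + motion + 0.01 * rng.standard_normal(t.size)

# A wrong hypothesis: a 1.5 Hz component that is not in the trace.
wrong = 0.05 * np.sin(2 * np.pi * 1.5 * t)

null_good = power_ratio(chroma, pulse, fs, gain=0.0)   # expected near 0
null_bad = power_ratio(chroma, wrong, fs, gain=0.0)    # expected far above 1
scale_good = power_ratio(chroma, pulse, fs, gain=2.0)  # expected near 4
scale_bad = power_ratio(chroma, wrong, fs, gain=2.0)   # expected far from 4

print("nulling ratio  (correct, wrong):", null_good, null_bad)
print("scaling ratio  (correct, wrong):", scale_good, scale_bad)
```

In this toy the much stronger motion component plays no part in either check, because the checks are evaluated only at the frequency of the hypothesis. The real framework operates on video and uses a learned extractor and editor, so the sketch conveys only the logic of the two tests.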