🤖 AI Summary
Existing self-supervised learning methods for physiological time series suffer from heuristic design and weakly constrained pretext tasks, failing to simultaneously preserve physiologically meaningful state information and suppress subject-specific noise. To address this, we propose PULSE—a novel framework that introduces dynamical system modeling into self-supervised learning for the first time. PULSE employs cross-sample reconstruction to explicitly identify shared system parameters (i.e., class-identifying dynamics), thereby disentangling transferable physiological mechanisms from individual-specific artifacts. We theoretically derive sufficient conditions for identifiability and recoverability of these system parameters. Extensive experiments on synthetic dynamical systems and real-world physiological signals—including ECG and PPG—demonstrate that PULSE significantly improves semantic discriminability, label efficiency, and cross-task transfer performance. By grounding representation learning in interpretable, structured dynamical priors, PULSE establishes a new paradigm for physiology-aware signal representation.
📝 Abstract
The effectiveness of self-supervised learning (SSL) for physiological time series depends on the ability of a pretraining objective to preserve information about the underlying physiological state while filtering out unrelated noise. However, existing strategies are limited by their reliance on heuristic principles or poorly constrained generative tasks. To address this limitation, we propose a pretraining framework that exploits the information structure of a dynamical-systems generative model across multiple time series. This framework yields our key insight: class identity can be efficiently captured by extracting information about the generative variables tied to the system parameters shared across similar time-series samples, while noise unique to individual samples should be discarded. Building on this insight, we propose PULSE, a cross-reconstruction-based pretraining objective for physiological time-series datasets that explicitly extracts system information while discarding non-transferable, sample-specific information. We establish theory that provides sufficient conditions for the system information to be recovered, and we validate it empirically in a synthetic dynamical-systems experiment. Furthermore, we apply our method to diverse real-world datasets, demonstrating that PULSE learns representations that broadly distinguish semantic classes, increase label efficiency, and improve transfer learning.
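To make the cross-reconstruction idea concrete, here is a minimal, hypothetical sketch of such an objective. It is not the authors' implementation: the encoder/decoder are stand-in random linear maps, and all names (`encode`, `decode`, `W_sys`, `W_ind`, `V`) and shapes are illustrative assumptions. The point is the loss structure: two samples assumed to come from the same underlying dynamics swap their "system" codes, so only information shared across the pair can help reconstruction, while sample-specific noise is pushed into the per-sample code.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_sys, d_ind = 64, 4, 2  # sequence length; system / sample-specific code sizes

def encode(x, W_sys, W_ind):
    """Split a signal into a shared 'system' code and a sample-specific code."""
    return W_sys @ x, W_ind @ x

def decode(z_sys, z_ind, V):
    """Reconstruct a signal from the concatenated codes."""
    return V @ np.concatenate([z_sys, z_ind])

# Two signals generated by the same (unknown) dynamics, with different noise.
x1 = np.sin(np.linspace(0, 4 * np.pi, T)) + 0.1 * rng.standard_normal(T)
x2 = np.sin(np.linspace(0, 4 * np.pi, T)) + 0.1 * rng.standard_normal(T)

# Randomly initialized linear maps standing in for learned networks.
W_sys = rng.standard_normal((d_sys, T)) / np.sqrt(T)
W_ind = rng.standard_normal((d_ind, T)) / np.sqrt(T)
V = rng.standard_normal((T, d_sys + d_ind))

# Cross-reconstruction: rebuild x2 from x1's system code plus x2's own
# sample-specific code, and symmetrically for x1. Minimizing this loss over
# the encoder/decoder forces the system code to carry the shared dynamics.
z1_sys, z1_ind = encode(x1, W_sys, W_ind)
z2_sys, z2_ind = encode(x2, W_sys, W_ind)
loss = (np.mean((decode(z1_sys, z2_ind, V) - x2) ** 2)
        + np.mean((decode(z2_sys, z1_ind, V) - x1) ** 2))
print(f"cross-reconstruction loss: {loss:.3f}")
```

In an actual training loop the linear maps would be replaced by learned networks and the loss minimized over many same-class (or augmented) pairs; this sketch only shows the code-swapping structure of the objective.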