Wavelet-Driven Masked Multiscale Reconstruction for PPG Foundation Models

📅 2026-01-18

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Existing PPG foundation models overlook the multi-band spectral structure of photoplethysmographic signals during pretraining and therefore struggle to capture multiscale physiological features, from fine-grained waveform morphology to global rhythms. To address this, we propose Masked Multiscale Reconstruction (MMR), a self-supervised framework that integrates wavelet-based multi-resolution time–frequency representations into PPG pretraining. The input signal is decomposed with a wavelet transform, a random subset of the wavelet coefficients is masked, and a Transformer encoder is trained to reconstruct the masked coefficients, which forces it to combine information across temporal and spectral scales. Across 19 health-related tasks, MMR matches or surpasses state-of-the-art open-source PPG foundation models, general-purpose time-series foundation models, and other self-supervised baselines on 17; analyses of the learned embeddings and ablations indicate that the wavelet-based representations capture robust, physiologically grounded features.
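To make the multi-resolution decomposition concrete, here is a minimal sketch of a multi-level wavelet transform in plain Python. It assumes the Haar wavelet, three levels, and a synthetic two-tone signal purely for illustration; the summary does not state which wavelet family, depth, or implementation the authors use, and all function names are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Multi-level decomposition, coarsest band first:
    [approx_L, detail_L, ..., detail_1]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Invert haar_decompose by merging bands from coarse to fine."""
    s = 1.0 / math.sqrt(2.0)
    approx = list(coeffs[0])
    for detail in coeffs[1:]:
        merged = []
        for a, d in zip(approx, detail):
            merged.append((a + d) * s)
            merged.append((a - d) * s)
        approx = merged
    return approx

# Synthetic stand-in for a PPG segment (assumption): a slow rhythm plus a
# faster, lower-amplitude component.
n = 64
signal = [math.sin(2 * math.pi * i / 32) + 0.3 * math.sin(2 * math.pi * i / 4)
          for i in range(n)]

coeffs = haar_decompose(signal, levels=3)
restored = haar_reconstruct(coeffs)

band_sizes = [len(band) for band in coeffs]
max_error = max(abs(a - b) for a, b in zip(signal, restored))
energy_in = sum(x * x for x in signal)
energy_out = sum(c * c for band in coeffs for c in band)
```

Each band covers a different time scale: the short coarse bands summarize slow rhythm, while the long fine bands carry local waveform detail. Because the transform is orthonormal, the coefficients preserve the signal's energy and the original samples can be recovered from them.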

📝 Abstract

Wearable foundation models have the potential to transform digital health by learning transferable representations from large-scale biosignals collected in everyday settings. While recent progress has been made in large-scale pretraining, most approaches overlook the spectral structure of photoplethysmography (PPG) signals, wherein physiological rhythms unfold across multiple frequency bands. Motivated by the insight that many downstream health-related tasks depend on multi-resolution features spanning fine-grained waveform morphology to global rhythmic dynamics, we introduce Masked Multiscale Reconstruction (MMR) for PPG representation learning, a self-supervised pretraining framework that explicitly learns from hierarchical time-frequency scales of PPG data. The pretraining task is designed to reconstruct randomly masked-out coefficients obtained from a wavelet-based multiresolution decomposition of PPG signals, forcing the transformer encoder to integrate information across temporal and spectral scales. We pretrain our model with MMR using ~17 million unlabeled 10-second PPG segments from ~32,000 smartwatch users. On 17 of 19 diverse health-related tasks, MMR trained on large-scale wearable PPG data improves over or matches state-of-the-art open-source PPG foundation models, time-series foundation models, and other self-supervised baselines. Extensive analysis of our learned embeddings and systematic ablations underscore the value of wavelet-based representations, showing that they capture robust and physiologically grounded features. Together, these results highlight the potential of MMR as a step toward generalizable PPG foundation models.
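The pretraining objective described in the abstract can be illustrated with a small sketch of random coefficient masking and a loss computed only on the hidden positions. Everything specific here is an assumption for illustration: the toy coefficient bands, the 50% mask ratio, zero-filling of masked positions, per-band masking, and the mean-squared-error loss are not taken from the paper, and the function names are hypothetical. The encoder itself is omitted.

```python
import random

def mask_coefficients(bands, mask_ratio, rng):
    """Hide a random subset of coefficients in every band.

    Returns (masked_bands, mask), where mask[b][i] is True if coefficient i
    of band b was hidden. Hidden positions are zero-filled (assumption).
    """
    masked_bands, mask = [], []
    for band in bands:
        k = round(mask_ratio * len(band))
        hidden = set(rng.sample(range(len(band)), k))
        mask.append([i in hidden for i in range(len(band))])
        masked_bands.append([0.0 if i in hidden else v
                             for i, v in enumerate(band)])
    return masked_bands, mask

def masked_mse(predicted, target, mask):
    """Mean squared error over hidden positions only."""
    total, count = 0.0, 0
    for p_band, t_band, m_band in zip(predicted, target, mask):
        for p, t, m in zip(p_band, t_band, m_band):
            if m:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0

# Toy wavelet coefficients (made up): one coarse band and one finer band.
bands = [
    [4.0, -2.0, 1.0, 3.0],
    [0.5, -0.5, 1.5, -1.5, 0.25, -0.25, 0.75, -0.75],
]

rng = random.Random(0)  # fixed seed so the example is repeatable
masked_bands, mask = mask_coefficients(bands, mask_ratio=0.5, rng=rng)

hidden_per_band = [sum(m_band) for m_band in mask]

# A model that copies its zero-filled input scores poorly on hidden positions,
# while a perfect reconstruction scores zero.
loss_copy_input = masked_mse(masked_bands, bands, mask)
loss_perfect = masked_mse(bands, bands, mask)
```

Restricting the loss to hidden coefficients means the model gains nothing by copying its visible inputs; it has to infer the missing values from coefficients at other positions and scales.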

Problem

Research questions and friction points this paper is trying to address.

photoplethysmography

multiscale representation

spectral structure

foundation models

physiological rhythms

Innovation

Methods, ideas, or system contributions that make the work stand out.

Masked Multiscale Reconstruction

Wavelet Transform

PPG Foundation Model