PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning

📅 2026-02-10

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Existing PPG datasets predominantly rely on numerical or task-specific labels, limiting their compatibility with language models. To bridge this gap, this work introduces PulseLM, a large-scale PPG-text question-answering dataset and benchmark that aligns raw PPG waveforms with natural language in a question-answering format. The authors aggregate 16 publicly available PPG sources, harmonize their annotations into 12 downstream tasks, and produce over one million 10-second PPG segments paired with nearly 2.5 million question-answer pairs. They also release the standardized dataset, reproducible training and evaluation protocols, and baseline benchmarks built on multimodal PPG-aware large language models, establishing a foundation for future research at the intersection of physiological signal processing and natural language understanding.

📝 Abstract

Photoplethysmography (PPG) is a widely used non-invasive sensing modality for continuous cardiovascular and physiological monitoring across clinical, laboratory, and wearable settings. While existing PPG datasets support a broad range of downstream tasks, they typically provide supervision in the form of numerical measurements or task-specific labels, limiting their compatibility with language-based interfaces and multimodal foundation models. In this work, we introduce PulseLM, a large-scale PPG-text question-answering (QA) dataset that bridges raw PPG waveforms and natural language through a unified QA formulation. PulseLM aggregates PPG recordings from 16 publicly available sources and harmonizes heterogeneous annotations into 12 downstream tasks. The dataset comprises over 1 million standardized 10-second PPG segments, associated with nearly 2.5 million question-answer pairs. We further define a reproducible data pipeline along with training and evaluation protocols, and establish baseline benchmarks using multimodal PPG-aware large language models. PulseLM provides a standardized foundation for studying language-grounded physiological inference, cross-dataset generalization, and scalable benchmarking of PPG-based multimodal models. We publicly release the dataset and code at https://huggingface.co/datasets/Manhph2211/PulseLM and https://github.com/manhph2211/PULSE-LM, respectively.
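
As a rough illustration of the QA formulation described in the abstract, the sketch below cuts a waveform into 10-second segments and turns a numerical label into a question-answer pair. It is not the authors' pipeline: the function names, record fields, question wording, and heart-rate thresholds (below 60 bpm "low", above 100 bpm "high") are all assumptions made for this example, and the sampling rate is left to the caller.

```python
from typing import Dict, List, Union

SEGMENT_SECONDS = 10  # segment length stated in the abstract


def segment_ppg(signal: List[float], fs_hz: int) -> List[List[float]]:
    """Split a waveform into non-overlapping 10-second windows.

    A trailing partial window is dropped. `fs_hz` is the sampling
    rate in Hz (an assumed input; the abstract does not give one).
    """
    window = SEGMENT_SECONDS * fs_hz
    return [
        signal[start:start + window]
        for start in range(0, len(signal) - window + 1, window)
    ]


def heart_rate_category(bpm: float) -> str:
    """Map a numerical heart rate to a text label.

    Thresholds are illustrative assumptions, not the dataset's
    actual label scheme.
    """
    if bpm < 60:
        return "low"
    if bpm > 100:
        return "high"
    return "normal"


def to_qa_record(segment: List[float], bpm: float) -> Dict[str, Union[List[float], str]]:
    """Pair one segment with a question and a text answer.

    Field names and question wording are hypothetical.
    """
    return {
        "signal": segment,
        "question": "What is the heart rate category for this PPG segment?",
        "answer": heart_rate_category(bpm),
    }
```

Calling `segment_ppg` on a 25-second recording sampled at 125 Hz would yield two full segments of 1250 samples each, and `to_qa_record` could then be applied to each segment with its numerical label. The point of the design is that a single text interface can cover labels that would otherwise need task-specific output heads.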

Problem

Research questions and friction points this paper is trying to address.

PPG-text learning

foundation dataset

multimodal foundation models

language-grounded physiological inference

photoplethysmography

Innovation

Methods, ideas, or system contributions that make the work stand out.

PPG-text learning

multimodal foundation model

question-answering dataset