PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning

Date: 2026-02-10
Citations: 1
Influential: 0

AI Summary
Existing PPG datasets predominantly rely on numerical or task-specific labels, limiting their compatibility with language models. To bridge this gap, this work introduces the first large-scale multimodal physiological signal–language reasoning benchmark by aligning raw PPG waveforms with natural language in a question-answering format. The authors integrate 16 publicly available PPG sources spanning 12 downstream tasks and employ data aggregation and annotation alignment techniques to generate over one million PPG segments paired with nearly 2.5 million question-answer pairs. They further release a standardized dataset, reproducible training and evaluation protocols, and a multimodal large language model-based baseline method for PPG-aware reasoning, establishing a foundation for future research at the intersection of physiological signal processing and natural language understanding.
๐Ÿ“ Abstract
Photoplethysmography (PPG) is a widely used non-invasive sensing modality for continuous cardiovascular and physiological monitoring across clinical, laboratory, and wearable settings. While existing PPG datasets support a broad range of downstream tasks, they typically provide supervision in the form of numerical measurements or task-specific labels, limiting their compatibility with language-based interfaces and multimodal foundation models. In this work, we introduce PulseLM, a large-scale PPG-text question-answering dataset that bridges raw PPG waveforms and natural language through a unified question-answering (QA) formulation. PulseLM aggregates PPG recordings from sixteen publicly available sources and harmonizes heterogeneous annotations into 12 downstream tasks. The dataset comprises over 1 million standardized 10-second PPG segments, associated with nearly 2.5 million question-answer pairs. We further define reproducible data pipeline, training, and evaluation protocols and establish baseline benchmarks using multimodal PPG-aware large language models. PulseLM provides a standardized foundation for studying language-grounded physiological inference, cross-dataset generalization, and scalable benchmarking of PPG-based multimodal models. We publicly release the dataset and code at https://huggingface.co/datasets/Manhph2211/PulseLM and https://github.com/manhph2211/PULSE-LM, respectively.
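To make the QA formulation concrete, the following is a minimal sketch of how a numerically labeled recording might be cut into 10-second segments and paired with a question and answer. Only the 10-second segment length comes from the abstract. The 125 Hz sampling rate, the non-overlapping windowing, the heart-rate task, the question template, and the names `QAExample`, `split_into_segments`, and `heart_rate_to_qa` are all illustrative assumptions, not the authors' pipeline.

```python
from dataclasses import dataclass
from typing import List

SEGMENT_SECONDS = 10  # segment length stated in the abstract


@dataclass
class QAExample:
    """One PPG segment paired with a question and its text answer (hypothetical schema)."""
    segment: List[float]
    question: str
    answer: str


def split_into_segments(signal: List[float], sampling_rate_hz: int) -> List[List[float]]:
    """Cut a recording into non-overlapping 10-second windows.

    Any trailing partial window is dropped (an assumption; the source
    does not say how incomplete windows are handled).
    """
    window = SEGMENT_SECONDS * sampling_rate_hz
    return [signal[i:i + window] for i in range(0, len(signal) - window + 1, window)]


def heart_rate_to_qa(segment: List[float], heart_rate_bpm: float) -> QAExample:
    """Turn a numeric heart-rate label into a QA pair using a made-up template."""
    return QAExample(
        segment=segment,
        question="What is the heart rate in this PPG segment?",
        answer=f"{round(heart_rate_bpm)} bpm",
    )


# Usage: a 25-second placeholder recording at an assumed 125 Hz.
recording = [0.0] * (25 * 125)
segments = split_into_segments(recording, 125)
examples = [heart_rate_to_qa(s, 72.4) for s in segments]
```

The point of the sketch is the shape of the data: each example keeps the raw waveform alongside free-text supervision, so one segment can carry several questions for different tasks.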
Problem

Research questions and friction points this paper is trying to address.

PPG-text learning
foundation dataset
multimodal foundation models
language-grounded physiological inference
photoplethysmography
Innovation

Methods, ideas, or system contributions that make the work stand out.

PPG-text learning
multimodal foundation model
question-answering dataset
physiological signal understanding
large language model