Capturing the Curve: Functional Data Analysis for Validated Digital Outcome Measures

📅 2026-05-27

🤖 AI Summary

Clinical trials typically rely on a small set of predefined scalar outcome measures, which use only part of the information in high-frequency, multilevel physiological signals. This work explores a data-driven alternative based on multilevel functional principal component analysis (MFPCA): projection scores are computed with respect to a reference population, giving low-dimensional summaries of how each individual differs from the dominant directions of variation at each hierarchical level. The between-subject MFPCA scores are then assessed against standard validation criteria for digital outcome measures. In a simulation study based on smartwatch ECG signals, the scores generally show high test-retest reliability and discriminate between groups across the simulated scenarios of change. In knee flexion-extension data from individuals living with Parkinson's disease, one of the MFPCA scores correlates more strongly with established gold-standard metrics than a pre-specified scalar and can detect clinical change.
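The idea of scoring individuals against a reference population can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a balanced design (every subject has the same number of repeated curves on a common grid), estimates the between-subject covariance by a method-of-moments average over pairs of distinct visits, and approximates the between-subject scores by projecting each subject's mean curve, whereas a full MFPCA would typically use best linear unbiased prediction. The function names `fit_reference_mfpca` and `between_subject_scores` are hypothetical.

```python
import numpy as np


def fit_reference_mfpca(X_ref, n_components=2):
    """Estimate the mean curve and between-subject eigenvectors.

    X_ref: array of shape (subjects, visits, grid points) from the
    reference population. Assumes a balanced design and a common grid.
    """
    n_subj, n_vis, _ = X_ref.shape
    mu = X_ref.mean(axis=(0, 1))            # overall mean curve
    R = X_ref - mu                          # centred curves

    # Sum over pairs of distinct visits (j != k) of R_ij R_ik^T, written as
    # (sum_j R_ij)(sum_k R_ik)^T minus the same-visit terms.
    S = R.sum(axis=1)                       # (subjects, grid)
    all_pairs = S.T @ S
    same_visit = np.einsum("ijs,ijt->st", R, R)
    K_B = (all_pairs - same_visit) / (n_subj * n_vis * (n_vis - 1))
    K_B = (K_B + K_B.T) / 2                 # enforce symmetry numerically

    # Eigenvectors are unit-norm in the Euclidean sense; quadrature weights
    # for the grid spacing are ignored in this sketch.
    evals, evecs = np.linalg.eigh(K_B)
    order = np.argsort(evals)[::-1][:n_components]
    return mu, evecs[:, order], evals[order]


def between_subject_scores(X, mu, phi):
    """Project each subject's mean curve onto the reference eigenvectors.

    X: array (subjects, visits, grid points); the subjects may be new
    individuals who were not part of the reference population.
    """
    subject_mean = X.mean(axis=1) - mu      # (subjects, grid)
    return subject_mean @ phi               # (subjects, n_components)
```

In use, `fit_reference_mfpca` would be run once on the reference population, and `between_subject_scores` would then be applied to any new individual with the stored `mu` and `phi`, so that every score is expressed relative to the same reference directions of variation.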

📝 Abstract

Digital health technologies enable high-frequency collection of data in near-continuous time and capture rich information about the health of individuals. The raw data collected by these devices often have a hierarchical functional structure: repeated physiological functions are observed over time and on multiple time scales (seconds, days, weeks). While many summaries can be derived from digital data, typically, only a small subset of pre-defined scalars is validated as outcome measures in clinical trials. We explore data-driven summaries based on between-subject scores from Multilevel Functional Principal Component Analysis (MFPCA), which are low-dimensional representations of functional data with robust statistical properties. Specifically, we compute MFPCA projection scores with respect to a reference population, summarising how individuals differ from the dominant directions of variation at each hierarchical level. Through a simulation study based on smartwatch electrocardiogram (ECG) signals, we compare MFPCA scores with pre-specified summaries in terms of validation criteria, including test-retest reliability and known-groups discrimination. We demonstrate that MFPCA scores generally have high reliability and can discriminate between groups across simulated scenarios of change. This offers an advantage when digital tools enable the measurement of novel physiological signals and the characteristics of the change are not yet defined. Finally, using knee flexion-extension data from individuals living with Parkinson's disease, we demonstrate that one of the MFPCA scores more strongly correlates with established gold-standard metrics and can detect clinical change, compared to a pre-specified scalar. We conclude that MFPCA-derived scores retain more information than typical outcome measures and open the door to using representation learning strategies in clinical trial settings.
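The two validation criteria named in the abstract can be made concrete with a short sketch. The abstract does not state which indices the authors use, so the choices here are assumptions: a one-way random-effects intraclass correlation coefficient, ICC(1,1), for test-retest reliability, and a pooled-standard-deviation standardised mean difference (Cohen's d) for known-groups discrimination. The function names `icc_oneway` and `cohens_d` are hypothetical. Either function can be applied to any scalar summary, whether an MFPCA score or a pre-specified measure.

```python
import numpy as np


def icc_oneway(Y):
    """One-way random-effects ICC(1,1).

    Y: array (subjects, repeated sessions) holding one scalar summary
    per subject and session.
    """
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    subject_means = Y.mean(axis=1)
    grand_mean = Y.mean()
    # Between-subject and within-subject mean squares from one-way ANOVA
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    msw = np.sum((Y - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def cohens_d(a, b):
    """Standardised mean difference between two groups (pooled SD)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)
```

An ICC close to 1 indicates that repeated sessions of the same individual give nearly the same value, while a large absolute d indicates that the summary separates the two known groups.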

Problem

Research questions and friction points this paper is trying to address.

digital outcome measures

functional data analysis

clinical validation

multilevel functional principal component analysis

physiological signals

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilevel Functional Principal Component Analysis

digital outcome measures

functional data analysis