🤖 AI Summary
This study addresses a key limitation of existing multivariate functional principal component analysis (MFPCA) methods: they assume that all functional observations share a common domain, so they struggle with real-world data whose observation periods vary across subjects. To overcome this, the authors propose an MFPCA framework tailored to the variable-domain setting. The approach first performs univariate variable-domain FPCA separately for each functional variable, stacks the resulting scores, and then smooths the empirical covariance matrix of the stacked scores over the domain length, yielding multivariate eigenfunctions and scores that account for heterogeneous observation intervals. By explicitly incorporating each subject's domain information, the method aims to avoid the distortions that can arise when domain variability is ignored and the data are binned. Simulation studies show that it outperforms approaches that ignore the variable-domain structure and rely on binning, and the method is applied to body temperature and capillary oxygen saturation (SpO$_2$) trajectories of patients hospitalized with COVID-19.
📝 Abstract
Multivariate functional principal component analysis (MFPCA) is a powerful dimension reduction technique for analyzing multiple functional variables simultaneously. However, existing MFPCA methods assume that all functional observations are recorded over a common, fixed domain. This assumption is often violated in practice, where the observation period varies across subjects, giving rise to what is known as variable domain functional data. We propose a novel approach that extends MFPCA to the variable domain setting. Our methodology performs univariate variable domain FPCA for each functional variable separately, stacks the resulting univariate scores, and then smooths the empirical covariance matrix of these stacked scores over the domain length. This allows us to estimate multivariate eigenfunctions and scores that properly account for varying observation periods. We demonstrate through extensive simulation studies that our proposed method outperforms approaches that ignore the variable domain structure and rely on binning strategies. The practical utility of our method is illustrated through an application analyzing body temperature and capillary oxygen saturation (SpO$_2$) trajectories of patients hospitalized with COVID-19, who experienced varying lengths of stay and monitoring periods.
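
The stacking and smoothing steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration only: the univariate scores are simulated rather than obtained from a variable domain FPCA, the smoother is a Gaussian-kernel Nadaraya-Watson estimator, the bandwidth is picked by hand, and the function names `smoothed_score_covariance` and `multivariate_scores` are hypothetical.

```python
import numpy as np


def smoothed_score_covariance(scores, lengths, t0, bandwidth):
    """Kernel-weighted covariance of the stacked scores at domain length t0.

    Assumption: the scores are already centered, so the covariance is a
    weighted average of outer products (Nadaraya-Watson, Gaussian kernel).
    """
    w = np.exp(-0.5 * ((lengths - t0) / bandwidth) ** 2)
    w = w / w.sum()
    return np.einsum("i,ij,ik->jk", w, scores, scores)


def multivariate_scores(scores, lengths, bandwidth, n_components):
    """Project each subject's stacked scores onto the leading eigenvectors
    of the covariance smoothed at that subject's own domain length."""
    n = scores.shape[0]
    out = np.empty((n, n_components))
    cols = np.arange(n_components)
    for i in range(n):
        cov = smoothed_score_covariance(scores, lengths, lengths[i], bandwidth)
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1][:n_components]
        vecs = eigvecs[:, order]
        # Eigenvector signs are arbitrary; as a simple convention, make the
        # largest-magnitude entry of each eigenvector positive.
        idx = np.argmax(np.abs(vecs), axis=0)
        signs = np.sign(vecs[idx, cols])
        signs[signs == 0] = 1.0
        out[i] = scores[i] @ (vecs * signs)
    return out


# Simulated stand-in for stacked univariate scores: 2 variables with
# 2 components each, whose variance grows with the domain length.
rng = np.random.default_rng(0)
n = 200
lengths = rng.uniform(5.0, 30.0, size=n)
scale = np.sqrt(lengths)[:, None] * np.array([1.0, 0.5, 0.8, 0.3])
scores = rng.standard_normal((n, 4)) * scale

rho = multivariate_scores(scores, lengths, bandwidth=3.0, n_components=2)
print(rho.shape)
```

In this sketch the eigenvectors play the role of the coefficients that combine the univariate eigenfunctions into multivariate ones. Because they are recomputed at each subject's domain length, the resulting scores reflect how the dependence between variables changes with the observation period. The sign convention used here is only a heuristic and may still flip between nearby domain lengths.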