๐ค AI Summary
Online functional principal component analysis (FPCA) for high-dimensional functional data streams remains challenging due to computational intractability and lack of scalable, theoretically grounded frameworks on non-Euclidean domains.
Method: This paper introduces the first scalable online FPCA framework operating directly on the Stiefel manifold. It combines tensor-product spline bases with penalty-based smoothness constraints, designs a Riemannian stochastic gradient descent algorithm, and incorporates adaptive momentum estimation and iterate averaging. A novel dynamic hyperparameter tuning strategy based on rolling-block validation is further proposed.
Contribution/Results: Experiments on synthetic and real-world spatiotemporal datasets demonstrate that the method achieves 10โ100ร speedup over state-of-the-art baselines while improving reconstruction accuracy by 35%โ52% (mean error reduction). It enables real-time dimensionality reduction and feature extraction, establishing a new paradigm for large-scale functional data stream analysisโrigorous in theory and viable in practice.
๐ Abstract
Multidimensional functional data streams arise in diverse scientific fields, yet their analysis poses significant challenges. We propose a novel online framework for functional principal component analysis that enables efficient and scalable modeling of such data. Our method represents functional principal components using tensor product splines, enforcing smoothness and orthonormality through a penalized framework on a Stiefel manifold. An efficient Riemannian stochastic gradient descent algorithm is developed, with extensions inspired by adaptive moment estimation and averaging techniques to accelerate convergence. Additionally, a dynamic tuning strategy for smoothing parameter selection is developed based on a rolling averaged block validation score that adapts to the streaming nature of the data. Extensive simulations and real-world applications demonstrate the flexibility and effectiveness of this framework for analyzing multidimensional functional data.