Fast Penalized Generalized Estimating Equations for Large Longitudinal Functional Datasets

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing functional regression methods struggle to efficiently analyze large-scale longitudinal binary or count functional data (e.g., calcium imaging time series in neuroscience). This paper proposes a fast, robust one-step penalized estimating equation approach that unifies continuous, count, and binary responses while accommodating both functional and scalar covariates. The authors develop an adaptive one-step M-estimation theory that ensures asymptotically valid confidence intervals for regression coefficients even under misspecification of the working correlation structure, with efficiency comparable to fully iterative estimators. By integrating penalized generalized estimating equations with longitudinal functional data analysis techniques, the framework enables automatic smoothing parameter selection and joint confidence interval construction. On binary functional data comprising 150,000 curves, each with 120 time points, the method completes estimation in just 13.5 minutes, uncovering time-dynamic effects overlooked by conventional approaches. An open-source R package, fastFGEE, is released to facilitate implementation.

📝 Abstract
Longitudinal binary or count functional data are common in neuroscience, but are often too large to analyze with existing functional regression methods. We propose one-step penalized generalized estimating equations that support continuous, count, or binary functional outcomes and are fast even when datasets have a large number of clusters and large cluster sizes. The method applies to both functional and scalar covariates, and the one-step estimation framework enables efficient smoothing parameter selection, bootstrapping, and joint confidence interval construction. Importantly, this semi-parametric approach yields coefficient confidence intervals that are provably valid asymptotically even under working correlation misspecification. By developing a general theory for adaptive one-step M-estimation, we prove that the coefficient estimates are asymptotically normal and as efficient as the fully iterated estimator; we verify these theoretical properties in extensive simulations. Finally, we apply our method to a calcium imaging dataset published in Nature, and show that it reveals important timing effects obscured in previous non-functional analyses. In doing so, we demonstrate scaling to common neuroscience dataset sizes: the one-step estimator fits a dataset with 150,000 (binary) functional outcomes, each observed at 120 functional domain points, in only 13.5 minutes on a laptop without parallelization. We release our implementation in the 'fastFGEE' package.
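The "one-step" idea at the heart of the estimator can be illustrated with a minimal sketch: starting from an initial estimate, take a single Newton-type update of the GEE score. The toy below uses a working-independence correlation structure and a logit link; it is an illustrative simplification in Python, not the paper's penalized functional estimator and not the fastFGEE API.

```python
import numpy as np

def one_step_gee_logistic(X, y, beta0):
    """Single Newton-type update of the GEE score for binary outcomes,
    assuming working independence and a logit link (illustrative sketch).

    beta1 = beta0 + (X' W X)^{-1} X'(y - mu),  W = diag(mu * (1 - mu))
    """
    eta = X @ beta0
    mu = 1.0 / (1.0 + np.exp(-eta))      # mean under the logit link
    w = mu * (1.0 - mu)                  # binomial variance function
    score = X.T @ (y - mu)               # estimating equation U(beta0)
    info = X.T @ (X * w[:, None])        # sensitivity (Hessian-like) matrix
    return beta0 + np.linalg.solve(info, score)
```

The paper's point is that, with a suitable initial estimate, one such update already attains the efficiency of the fully iterated solution, which is what makes repeated refits (for smoothing parameter selection or bootstrapping) cheap.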
Problem

Research questions and friction points this paper is trying to address.

Efficiently analyzing large longitudinal binary or count functional data
Obtaining valid confidence intervals with both functional and scalar covariates
Scaling quickly to neuroscience datasets with high-dimensional outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

One-step penalized generalized estimating equations
Supports continuous, count, and binary outcomes
Efficient smoothing parameter selection
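The "penalized" ingredient can be sketched in the same spirit: a roughness penalty on basis coefficients is added to the one-step update, and the smoothing parameter trades data fit against smoothness. Below is a minimal Python illustration under the same working-independence, logit-link simplification; the helper names and the second-difference penalty are assumptions for illustration, not fastFGEE's interface.

```python
import numpy as np

def second_diff_penalty(k):
    """Penalty matrix P = D'D from second-order differences, so that
    lam * b' P b penalizes roughness of spline coefficients b
    (hypothetical helper for illustration)."""
    D = np.diff(np.eye(k), n=2, axis=0)
    return D.T @ D

def penalized_one_step(X, y, beta0, lam):
    """One penalized GEE update, working independence, logit link:
    solve (X'WX + lam*P) delta = X'(y - mu) - lam*P @ beta0."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta0)))
    w = mu * (1.0 - mu)
    P = second_diff_penalty(X.shape[1])
    lhs = X.T @ (X * w[:, None]) + lam * P
    rhs = X.T @ (y - mu) - lam * (P @ beta0)
    return beta0 + np.linalg.solve(lhs, rhs)
```

Because the update is a single linear solve, re-running it over a grid of smoothing parameters is cheap, which is the mechanism behind the efficient smoothing parameter selection claimed above.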