Model Form Identification in High-Dimensional Functional Linear Regressions

📅 2026-05-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two linked problems in high-dimensional functional linear regression: selecting the active variables and identifying the functional form of their effects. Each predictor is an infinite-dimensional function and the number of predictors is ultra-high, which makes it difficult to balance accuracy and interpretability. The authors propose MoFI-FLR, a two-stage framework. The first stage uses a functional elastic-net penalty to screen out irrelevant predictors. The second stage decomposes the functional coefficient of each selected predictor into an interpretable finite-dimensional simple component and an infinite-dimensional complementary component, and penalizes only the latter to separate simple effects from complex ones. The approach is presented as the first to automatically identify the functional forms of predictors in high-dimensional functional regression while simultaneously performing variable selection and effect-type discrimination. Built on RKHS theory, a novel coefficient decomposition, and an efficient optimization algorithm, the method comes with non-asymptotic guarantees for consistent recovery of the active set and accurate identification of functional forms. Simulations and an analysis of Psychomotor Vigilance Task EEG data are used to evaluate its computational efficiency and interpretability.
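
The setup described above can be written compactly. The display below is a notational sketch only: the symbols (the domain \mathcal{T}, the intercept \mu, the superscripts S and C, and the spaces \mathcal{H}_0 and \mathcal{H}_1) are assumptions chosen for illustration and may differ from the paper's own notation.

```latex
% Scalar-on-function regression with p functional predictors (assumed notation)
Y_i = \mu + \sum_{j=1}^{p} \int_{\mathcal{T}} X_{ij}(t)\,\beta_j(t)\,dt + \varepsilon_i,
\qquad i = 1, \dots, n.

% Stage two: split each selected coefficient into a simple and a complementary part
\beta_j = \beta_j^{S} + \beta_j^{C},
\qquad \beta_j^{S} \in \mathcal{H}_0 \ (\dim \mathcal{H}_0 < \infty),
\quad \beta_j^{C} \in \mathcal{H}_1.

% Effect type is read off from the complementary part
\text{simple effect: } \beta_j^{C} = 0,
\qquad
\text{complex effect: } \beta_j^{C} \neq 0.
```

Because only \beta_j^{C} is penalized, a predictor whose complementary part is shrunk to zero is reported as having a simple effect.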

📝 Abstract

High-dimensional functional data are becoming increasingly common in fields such as environmental monitoring and neuroimaging. This paper studies high-dimensional functional linear regression models that relate a scalar response to ultra-high-dimensional functional predictors, where each predictor is treated as a random element in an infinite-dimensional functional space. To address the dual challenges of high dimensionality and model interpretability, we propose MoFI-FLR, a novel two-step estimation framework rooted in reproducing kernel Hilbert space (RKHS) theory. The first step employs a functional elastic-net penalty to screen out irrelevant covariates, while the second step decomposes each selected predictor's functional coefficient into an interpretable finite-dimensional simple component and an infinite-dimensional complementary component. By penalizing only the complementary component, our method automatically distinguishes simple effects, which consist only of the simple component, from complex effects, which also include complementary deviations. Under mild regularity conditions, we establish non-asymptotic theoretical guarantees, demonstrating that MoFI-FLR consistently recovers the active covariates and accurately identifies their true functional forms. We develop a computationally efficient algorithm to implement the proposed method and evaluate its performance through comprehensive simulation studies and an application to Psychomotor Vigilance Task EEG data.
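
To make the simple-versus-complex distinction concrete, here is a minimal Python sketch of the decomposition idea applied to a known coefficient function on a grid. It is not the MoFI-FLR estimator: the choice of linear functions as the simple space, the group soft-thresholding rule, the threshold `lam`, and the helper names `decompose`, `group_soft_threshold`, and `identify_form` are all assumptions made for illustration.

```python
import numpy as np

def decompose(beta, t, degree=1):
    """Split beta(t) into a polynomial 'simple' part and a residual
    'complementary' part (assumed simple space: polynomials up to `degree`)."""
    V = np.vander(t, degree + 1, increasing=True)  # columns 1, t, ..., t^degree
    coef, *_ = np.linalg.lstsq(V, beta, rcond=None)
    simple = V @ coef
    return simple, beta - simple

def group_soft_threshold(v, lam, dt):
    """Shrink the whole vector v toward zero; the result is exactly zero
    when the (grid-approximated) L2 norm of v is at most lam."""
    norm = np.sqrt(dt * np.sum(v ** 2))
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v

def identify_form(beta, t, lam):
    """Label an effect 'simple' if the shrunken complementary part vanishes."""
    simple, comp = decompose(beta, t)
    comp_shrunk = group_soft_threshold(comp, lam, dt=t[1] - t[0])
    label = "complex" if comp_shrunk.any() else "simple"
    return label, simple, comp_shrunk

t = np.linspace(0.0, 1.0, 101)
beta_linear = 2.0 - 3.0 * t                            # purely linear effect
beta_wiggly = 2.0 - 3.0 * t + np.sin(4 * np.pi * t)    # linear plus deviation

print(identify_form(beta_linear, t, lam=0.1)[0])
print(identify_form(beta_wiggly, t, lam=0.1)[0])
```

Group soft-thresholding is used here because a penalty on the unsquared norm of a block can set the entire block to zero, which is what allows an effect to be declared exactly simple. In the actual method the penalty acts inside a regression fit rather than on a known coefficient.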

Problem

Research questions and friction points this paper is trying to address.

high-dimensional functional data

functional linear regression

model form identification

interpretability

ultra-high-dimensional predictors

Innovation

Methods, ideas, or system contributions that make the work stand out.

functional linear regression

high-dimensional data

model form identification