Missingness-Adaptive Factor Identification in High-Dimensional Data

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the problem of estimating the number of latent factors in high-dimensional factor models when data are missing, a setting in which existing methods often perform poorly. The authors introduce the notion of “identifiable factors”, those that can be reliably recovered despite missing observations, and propose the Missingness-Adaptive Thresholding Estimator (MATE). They describe it as, to their knowledge, the first missingness-adaptive framework for factor number determination: it operates without imputation, does not impose restrictive assumptions on factor strength, and accommodates both homogeneous and heterogeneous missingness. The paper proves consistency under a range of structural conditions, and its simulations and real-data applications report that MATE outperforms state-of-the-art estimators and is more robust when missingness rates are high and factor signals are weak.

📝 Abstract

Determining the number of factors in high-dimensional factor models remains a fundamental challenge, particularly when data are incomplete. This paper introduces the concept of identifiable factors, those that can be reliably recovered despite missing observations, and proposes the Missingness-Adaptive Thresholding Estimator (MATE). To our knowledge, MATE is the first missingness-adaptive framework for factor number determination that accommodates both homogeneous and heterogeneous missingness without imposing restrictive assumptions on factor strength. Notably, it operates without data imputation, circumventing the computational burden associated with most existing approaches. We establish a rigorous theoretical foundation for MATE, proving its consistency under a range of structural conditions. Extensive simulations and real-world applications demonstrate that MATE consistently outperforms state-of-the-art methods, exhibiting superior robustness in settings with high missingness rates and weak factor signals.
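The abstract does not give MATE's statistic or threshold, so the following is only a generic sketch of the broader idea it describes: counting factors by thresholding eigenvalues without imputing missing entries. It is not the authors' estimator. The function names, the pairwise-complete covariance, and the fixed `threshold` argument are all assumptions made for illustration; a missingness-adaptive method would presumably choose the threshold from the data and the observation pattern.

```python
import numpy as np


def pairwise_complete_covariance(X):
    """Covariance from observed entries only (NaN marks missing); no imputation.

    Hypothetical helper for illustration, not taken from the paper.
    """
    observed = ~np.isnan(X)
    # Center each column by the mean of its observed entries.
    col_means = np.nanmean(X, axis=0)
    Xc = np.where(observed, X - col_means, 0.0)
    # counts[j, k] = number of rows in which columns j and k are both observed.
    obs = observed.astype(float)
    counts = obs.T @ obs
    # Missing entries contribute zero to the cross-products, so dividing by
    # the pairwise counts averages over jointly observed rows only.
    return (Xc.T @ Xc) / np.maximum(counts, 1.0)


def count_factors_by_threshold(X, threshold):
    """Count eigenvalues of the pairwise-complete covariance above `threshold`.

    `threshold` is a user-supplied constant here (an assumption); it stands in
    for whatever data-driven rule an adaptive estimator would use.
    """
    cov = pairwise_complete_covariance(X)
    eigenvalues = np.linalg.eigvalsh(cov)  # cov is symmetric by construction
    return int(np.sum(eigenvalues > threshold))


if __name__ == "__main__":
    # Simulated panel: 3 strong factors, 20% of entries missing at random.
    rng = np.random.default_rng(0)
    T, N, r = 600, 60, 3
    factors = rng.standard_normal((T, r))
    loadings = rng.standard_normal((N, r))
    X = factors @ loadings.T + rng.standard_normal((T, N))
    X[rng.random((T, N)) < 0.2] = np.nan
    print(count_factors_by_threshold(X, threshold=12.0))
```

A fixed threshold like this only works when factor eigenvalues are well separated from the noise; the weak-factor and heterogeneous-missingness settings the abstract highlights are exactly where such a naive rule would be expected to struggle.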

Problem

Research questions and friction points this paper is trying to address.

factor number determination

high-dimensional data

missing data

identifiable factors

factor models

Innovation

Methods, ideas, or system contributions that make the work stand out.

missingness-adaptive

factor identification

high-dimensional data