🤖 AI Summary
This study addresses the challenge of accurately estimating the number of latent factors in high-dimensional factor models when data are missing, a problem that often undermines existing methods. The authors introduce the notion of “identifiable factors,” those that can be reliably recovered despite missing observations, and propose the Missingness-Adaptive Thresholding Estimator (MATE). According to the authors, MATE is the first missingness-adaptive framework for factor number selection that operates without imputation, does not impose restrictive assumptions on factor strength, and accommodates both homogeneous and heterogeneous missingness. The paper proves the estimator's consistency under a range of structural conditions, and its simulations and real-world applications show MATE outperforming state-of-the-art estimators, with particular robustness under high missingness rates and weak factor signals.
📝 Abstract
Determining the number of factors in high-dimensional factor models remains a fundamental challenge, particularly when data are incomplete. This paper introduces the concept of identifiable factors, those that can be reliably recovered despite missing observations, and proposes the Missingness-Adaptive Thresholding Estimator (MATE). To our knowledge, MATE is the first missingness-adaptive framework for factor number determination that accommodates both homogeneous and heterogeneous missingness without imposing restrictive assumptions on factor strength. Notably, it operates without data imputation, circumventing the computational burden associated with most existing approaches. We establish a rigorous theoretical foundation for MATE, proving its consistency under a range of structural conditions. Extensive simulations and real-world applications demonstrate that MATE consistently outperforms state-of-the-art methods, exhibiting superior robustness in settings with high missingness rates and weak factor signals.