🤖 AI Summary
Traditional PCA can be inconsistent in high-dimensional factor models when the factors are weak or the noise has a complex dependence structure. This paper proposes a weighted PCA method: it uses a theory-driven weight matrix to form a weighted linear combination of the autocovariance matrices of the observed variables, thereby correcting the estimation bias, and it adds a data-driven adaptive mechanism that selects the weight from a large candidate class. The resulting framework yields consistent and asymptotically normal estimators of both the factor loadings and the common factors under substantially weaker identification conditions than conventional PCA requires. Theoretical analysis and numerical experiments show that the method outperforms conventional PCA and recently developed alternatives in challenging settings, including weak factors, heterogeneous noise, and strong serial correlation, while remaining robust and computationally feasible.
📝 Abstract
Principal component analysis (PCA) is arguably the most widely used approach for large-dimensional factor analysis. While it is effective when the factors are sufficiently strong, it can be inconsistent when the factors are weak and/or the noise has a complex dependence structure. We argue that the inconsistency often stems from bias and introduce a general approach to restore consistency. Specifically, we propose a general weighting scheme for PCA and show that, with a suitable choice of weighting matrices, it is possible to obtain consistent and asymptotically normal estimators under much weaker conditions than those required by the usual PCA. While the optimal weight matrix may require knowledge of the factors and of the covariance of the idiosyncratic noise, neither of which is known a priori, we develop an agnostic approach that adaptively chooses from a large class of weighting matrices and can be viewed as PCA for weighted linear combinations of auto-covariances among the observations. Theoretical and numerical results demonstrate the merits of our methodology over the usual PCA and other recently developed techniques for large-dimensional approximate factor models.
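As a rough illustration of what "PCA for weighted linear combinations of auto-covariances" might look like, the Python sketch below eigendecomposes a weighted sum of symmetrized sample autocovariance matrices. It is not the paper's estimator. The scalar per-lag weights, the symmetrization, the ranking of eigenvalues by absolute value, and the function names `lagged_autocov` and `weighted_autocov_pca` are all assumptions made for illustration, and the paper's weighting matrices and adaptive selection rule are not reproduced.

```python
import numpy as np


def lagged_autocov(X, k):
    """Sample lag-k autocovariance of a T x p data matrix X (rows are time points)."""
    T = X.shape[0]
    Xc = X - X.mean(axis=0)          # center each variable
    return Xc[k:].T @ Xc[:T - k] / T  # p x p matrix; lag 0 is the sample covariance


def weighted_autocov_pca(X, weights, r):
    """Top-r eigenvectors of a weighted sum of symmetrized autocovariances.

    weights[k] multiplies the lag-k autocovariance (hypothetical scalar weights).
    weights = [1.0] reduces to ordinary PCA on the sample covariance.
    """
    p = X.shape[1]
    M = np.zeros((p, p))
    for k, w in enumerate(weights):
        S = lagged_autocov(X, k)
        M += w * (S + S.T) / 2       # symmetrize so that eigh applies
    eigvals, eigvecs = np.linalg.eigh(M)
    # M need not be positive semidefinite, so rank by absolute eigenvalue.
    order = np.argsort(-np.abs(eigvals))[:r]
    loadings = eigvecs[:, order]                 # p x r, orthonormal columns
    factors = (X - X.mean(axis=0)) @ loadings    # T x r projected scores
    return loadings, factors


# Toy usage on simulated data: one serially correlated factor plus
# heteroscedastic white noise (all settings here are illustrative).
rng = np.random.default_rng(0)
T, p = 500, 20
f = np.zeros(T)
for t in range(1, T):
    f[t] = 0.8 * f[t - 1] + rng.standard_normal()
lam = rng.standard_normal(p)
noise = rng.standard_normal((T, p)) * rng.uniform(0.5, 3.0, size=p)
X = np.outer(f, lam) + noise

# Zero weight on lag 0 discards the noise variance, which is uncorrelated over time here.
loadings, factors = weighted_autocov_pca(X, weights=[0.0, 1.0, 1.0], r=1)
```

Putting zero weight on lag 0 is one simple way to see the idea: serially uncorrelated noise contributes to the lag-0 covariance but contributes only sampling error to the higher-lag autocovariances, so the bias it induces in ordinary PCA is reduced. Which weights are appropriate in general is exactly what the paper's adaptive procedure addresses.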