Mean Square Errors of factors extracted using principal components, linear projections, and Kalman filter

📅 2026-01-07
🤖 AI Summary
This study addresses uncertainty quantification in factor extraction from systems with a large cross-sectional dimension, which matters when the extracted factors are interpreted directly or used to summarize many potential predictors. It compares the finite-$N$ mean square errors (MSEs) of factors extracted by Principal Components (PC) and by the Kalman filter (KF) under different structures of the idiosyncratic cross-correlations, including the case in which the idiosyncratic components are wrongly treated as cross-sectionally homoscedastic and/or uncorrelated. The MSEs of PC factors, which implicitly treat the true factors as deterministic, are shown to be larger than those of KF factors, which treat the true factors as serially independent or autocorrelated random variables. The relevance of these results for constructing confidence intervals for the factors is illustrated with simulated data.

📝 Abstract
Factor extraction from systems of variables with a large cross-sectional dimension, $N$, is often based on either Principal Components (PC)-based procedures or Kalman filter (KF)-based procedures. Measuring the uncertainty of the extracted factors is important when, for example, they have a direct interpretation and/or they are used to summarize the information in a large number of potential predictors. In this paper, we compare the finite-$N$ mean square errors (MSEs) of PC and KF factors extracted under different structures of the idiosyncratic cross-correlations. We show that the MSEs of PC-based factors, implicitly based on treating the true underlying factors as deterministic, are larger than the corresponding MSEs of KF factors, obtained by treating the true factors as either serially independent or autocorrelated random variables. We also study and compare the MSEs of PC and KF factors estimated when the idiosyncratic components are wrongly treated as cross-sectionally homoscedastic and/or uncorrelated. The relevance of the results for the construction of confidence intervals for the factors is illustrated with simulated data.
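The MSE ordering described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's setup: it assumes a single static factor, known loadings, a serially independent factor with unit variance, and a diagonal heteroscedastic idiosyncratic covariance. Under those assumptions, the least-squares estimator (factor treated as fixed, idiosyncratic structure ignored) stands in for a PC-type estimate, and the linear projection of the factor on the data stands in for the KF update without dynamics. The function name `mse_known_loadings` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def mse_known_loadings(lam, sigma_e, sigma_f2=1.0):
    """Closed-form MSEs for one static factor with known loadings.

    Model (assumed for illustration): y = lam * f + e,
    Var(f) = sigma_f2, Cov(e) = sigma_e, f and e uncorrelated.
    Returns (mse_ls, mse_lp, w_ls, w_lp).
    """
    lam = np.asarray(lam, dtype=float)
    sigma_e = np.asarray(sigma_e, dtype=float)

    # Least squares: f_hat = w_ls' y, factor treated as a fixed parameter.
    w_ls = lam / (lam @ lam)
    mse_ls = w_ls @ sigma_e @ w_ls

    # Linear projection: f_hat = Cov(f, y) Cov(y)^{-1} y, factor treated as random.
    sigma_y = sigma_f2 * np.outer(lam, lam) + sigma_e
    w_lp = sigma_f2 * np.linalg.solve(sigma_y, lam)
    mse_lp = sigma_f2 - sigma_f2 * (lam @ w_lp)

    return mse_ls, mse_lp, w_ls, w_lp


# Hypothetical design: N series, heteroscedastic but uncorrelated idiosyncratic noise.
rng = np.random.default_rng(0)
N, T = 20, 50_000
lam = rng.normal(1.0, 0.5, size=N)
sigma_e = np.diag(rng.uniform(0.5, 3.0, size=N))

mse_ls, mse_lp, w_ls, w_lp = mse_known_loadings(lam, sigma_e)

# Monte Carlo check of the closed-form MSEs.
f = rng.normal(size=T)
e = rng.multivariate_normal(np.zeros(N), sigma_e, size=T)
y = f[:, None] * lam + e
mc_ls = np.mean((y @ w_ls - f) ** 2)
mc_lp = np.mean((y @ w_lp - f) ** 2)

print(f"least squares   : closed form {mse_ls:.4f}, simulated {mc_ls:.4f}")
print(f"linear projection: closed form {mse_lp:.4f}, simulated {mc_lp:.4f}")
```

Because the linear projection is the minimum-MSE linear estimator when the factor is random, its MSE cannot exceed that of least squares in this toy design. The sketch does not cover estimated loadings, factor dynamics, or cross-correlated idiosyncratic components, all of which the abstract says the paper addresses.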
Problem

Research questions and friction points this paper is trying to address.

factor extraction
mean square error
principal components
Kalman filter
cross-sectional correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mean Square Error
Principal Components
Kalman Filter
Factor Extraction
Idiosyncratic Correlation
Matteo Barigozzi
Full Professor - Alma Mater Studiorum Università di Bologna
Time Series Analysis - High dimensional data - Factor models - Networks
Diego Fresoli
Department of Economic Analysis-Quantitative Economics, Universidad Autonoma de Madrid (Spain)
Esther Ruiz
Department of Statistics, Universidad Carlos III de Madrid (Spain)