🤖 AI Summary
This paper addresses dynamic modeling of a response variable that depends on high-dimensional covariates. Methodologically, it proposes a nonparametric, data-driven dynamic factor model built on anisotropic diffusion maps: a graph Laplacian embedding recovers low-dimensional coordinates, linear diffusions approximate the dynamics of those coordinates, and Kalman filtering predicts the covariates and the response from the embedding space, capturing their nonlinear joint dynamics without a parametric model. Theoretically, it extends Singer's convergence rate analysis of the graph Laplacian from independent uniform samples on a compact manifold to time series generated by Langevin diffusions in Euclidean space, and it establishes the robustness of the linear diffusion approximation and the convergence of ergodic averages under standard spectral assumptions. Empirically, the method is applied to stress testing of equity portfolios using financial and macroeconomic factors from the Federal Reserve's supervisory scenarios; in historical backtests spanning three major financial crises, it reduces mean absolute error by up to 55% relative to standard scenario analysis and by up to 39% relative to PCA.
📝 Abstract
We propose a data-driven dynamic factor framework where a response variable depends on a high-dimensional set of covariates, without imposing any parametric model on the joint dynamics. Leveraging Anisotropic Diffusion Maps, a nonlinear manifold learning technique introduced by Singer and Coifman, our framework uncovers the joint dynamics of the covariates and responses in a purely data-driven way. We approximate the embedding dynamics using linear diffusions, and exploit Kalman filtering to predict the evolution of the covariates and response variables directly from the diffusion map embedding space. We generalize Singer's convergence rate analysis of the graph Laplacian from the case of independent uniform samples on a compact manifold to the case of time series arising from Langevin diffusions in Euclidean space. Furthermore, we provide rigorous justification for our procedure by showing the robustness of approximations of the diffusion map coordinates by linear diffusions, and the convergence of ergodic averages under standard spectral assumptions on the underlying dynamics. We apply our method to the stress testing of equity portfolios using a combination of financial and macroeconomic factors from the Federal Reserve's supervisory scenarios. We demonstrate that our data-driven stress testing method outperforms standard scenario analysis and Principal Component Analysis benchmarks through historical backtests spanning three major financial crises, achieving reductions in mean absolute error of up to 55% and 39% for scenario-based portfolio return prediction, respectively.
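The pipeline described in the abstract (embed, approximate the embedding dynamics linearly, then filter) can be illustrated with a minimal sketch. This is not the authors' implementation, and it makes several simplifying assumptions: it uses a plain isotropic diffusion map with a Gaussian kernel rather than the anisotropic variant of Singer and Coifman, a discrete-time least-squares fit in place of a linear diffusion, a median-distance bandwidth heuristic, and synthetic data. The function names `diffusion_map`, `fit_linear_dynamics`, and `kalman_filter` are hypothetical.

```python
import numpy as np

def diffusion_map(X, n_coords=2, eps=None, alpha=1.0):
    """Isotropic diffusion map of the rows of X (simplification: not anisotropic)."""
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if eps is None:
        eps = np.median(D2)                 # assumed bandwidth heuristic
    K = np.exp(-D2 / eps)                   # Gaussian kernel
    q = K.sum(axis=1)
    K = K / np.outer(q, q) ** alpha         # density normalization
    d = K.sum(axis=1)
    S = K / np.sqrt(np.outer(d, d))         # symmetric conjugate of P = D^{-1} K
    w, V = np.linalg.eigh(S)
    order = np.argsort(w)[::-1]             # sort eigenvalues in descending order
    w, V = w[order], V[:, order]
    psi = V / np.sqrt(d)[:, None]           # right eigenvectors of P
    # Skip the trivial constant eigenvector; scale coordinates by eigenvalue.
    return w[1:n_coords + 1], psi[:, 1:n_coords + 1] * w[1:n_coords + 1]

def fit_linear_dynamics(Z):
    """Least-squares fit of Z[t+1] ~ A @ Z[t], with residual covariance Q."""
    A_T, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)
    resid = Z[1:] - Z[:-1] @ A_T
    return A_T.T, np.atleast_2d(np.cov(resid, rowvar=False))

def kalman_filter(ys, A, Q, H, R, m0, P0):
    """Standard linear-Gaussian Kalman filter; returns the filtered state means."""
    m, P, means = m0, P0, []
    for y in ys:
        m, P = A @ m, A @ P @ A.T + Q       # predict step
        S = H @ P @ H.T + R                 # innovation covariance
        G = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        m = m + G @ (y - H @ m)             # update step
        P = (np.eye(len(m)) - G @ H) @ P
        means.append(m)
    return np.array(means)

# Synthetic example: one latent factor drives nonlinear covariates and a response.
rng = np.random.default_rng(0)
T = 300
theta = np.cumsum(0.05 * rng.standard_normal(T))
X = np.column_stack([np.sin(theta), np.cos(theta), theta])
X = X + 0.01 * rng.standard_normal((T, 3))
y = theta + 0.05 * rng.standard_normal(T)

eigvals, Z = diffusion_map(X, n_coords=2)   # embedding coordinates
A, Q = fit_linear_dynamics(Z)               # linear dynamics in embedding space

# Assumed observation model: response is affine in the embedding coordinates.
design = np.column_stack([Z, np.ones(T)])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
H, c = coef[:-1][None, :], coef[-1]
R = np.atleast_2d(np.var(y - design @ coef))

means = kalman_filter((y - c)[:, None], A, Q, H, R, Z[0], np.eye(2))
y_next = H @ (A @ means[-1]) + c            # one-step-ahead response prediction
```

Here `y_next` is a one-step-ahead prediction of the response obtained by propagating the last filtered state through the fitted linear dynamics. A stress-testing use would instead condition on scenario values of selected covariates, which this sketch does not attempt.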