🤖 AI Summary
This study addresses three challenges in data-driven latent factor discovery for stock return prediction: weak predictive signals, strong noise interference, and the difficulty of disentangling effective latent structures from handcrafted prior factors. To this end, we propose a Cascaded Residual Hypergraph Neural Network integrated with Temporal Residual Contrastive Learning. The method models the residuals left after accounting for prior factors, uses hypergraphs to capture high-order market interdependencies, applies residual mechanisms to suppress noise, and introduces temporal contrastive learning to make latent factor representations more consistent over time. Evaluated on real-world A-share market data, the model outperforms existing state-of-the-art methods. The discovered latent factors exhibit both economic interpretability and robust out-of-sample forecasting performance across time horizons. This work offers an interpretable and reusable approach to data-driven factor discovery in quantitative finance.
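The idea of modeling residuals of prior factors can be illustrated with a deliberately simplified linear stand-in. The sketch below is not the paper's hypergraph architecture: it assumes a single cross-section of returns and uses ordinary least squares to strip out the part explained by prior factor exposures. The function name `prior_factor_residuals`, the synthetic data, and the coefficient values are all hypothetical.

```python
import numpy as np

def prior_factor_residuals(returns, exposures):
    """Remove the part of cross-sectional returns explained by prior factors.

    returns:   (n_stocks,) array of returns for one period
    exposures: (n_stocks, n_factors) array of prior factor exposures
    Linear OLS is an assumption made for illustration only.
    """
    # Add an intercept column so the residuals are also mean-zero.
    design = np.column_stack([np.ones(len(returns)), exposures])
    coef, *_ = np.linalg.lstsq(design, returns, rcond=None)
    return returns - design @ coef

# Synthetic example: returns driven by 3 known factors plus 1 unobserved driver.
rng = np.random.default_rng(0)
exposures = rng.normal(size=(200, 3))
hidden = rng.normal(size=200)  # hypothetical hidden factor exposure
returns = exposures @ np.array([0.5, -0.2, 0.1]) + 0.3 * hidden

residuals = prior_factor_residuals(returns, exposures)
```

In this toy setting the residuals are orthogonal to the prior exposures by construction, so whatever structure remains in them is what a hidden-factor model would have to explain.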
📝 Abstract
As a fundamental method in economics and finance, the factor model has been extensively utilized in quantitative investment. In recent years, there has been a paradigm shift from traditional linear models with expert-designed factors to more flexible nonlinear machine learning-based models with data-driven factors, aiming to enhance the effectiveness of these factor models. However, due to the low signal-to-noise ratio in market data, mining effective factors in data-driven models remains challenging. In this work, we propose a hypergraph-based factor model with temporal residual contrastive learning (FactorGCL) that employs a hypergraph structure to better capture high-order nonlinear relationships among stock returns and factors. To mine hidden factors that supplement human-designed prior factors for predicting stock returns, we design a cascading residual hypergraph architecture, in which the hidden factors are extracted from the residual information after removing the influence of prior factors. Additionally, we propose a temporal residual contrastive learning method to guide the extraction of effective and comprehensive hidden factors by contrasting stock-specific residual information over different time periods. Our extensive experiments on real stock market data demonstrate that FactorGCL not only outperforms existing state-of-the-art methods but also mines effective hidden factors for predicting stock returns.
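The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, an InfoNCE-style loss, applied to the idea of contrasting stock-specific residual information across time periods. It assumes that the residual embedding of a stock at one period and the same stock's embedding at another period form a positive pair, while other stocks serve as negatives. The function name `info_nce`, the temperature value, and the synthetic embeddings are hypothetical.

```python
import numpy as np

def info_nce(anchor, positive, temperature=0.1):
    """InfoNCE-style loss; row i of `anchor` is paired with row i of `positive`.

    anchor, positive: (n_stocks, dim) residual embeddings from two time periods.
    Every other row of `positive` acts as a negative for row i of `anchor`.
    """
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # cosine similarities, scaled
    logits -= logits.max(axis=1, keepdims=True)  # shift for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # cross-entropy on matched pairs

# Synthetic example: two noisy views of the same stock-specific signal.
rng = np.random.default_rng(0)
base = rng.normal(size=(32, 8))
view_t = base + 0.05 * rng.normal(size=base.shape)
view_next = base + 0.05 * rng.normal(size=base.shape)

aligned = info_nce(view_t, view_next)
shuffled = info_nce(view_t, view_next[rng.permutation(32)])
```

When the two periods carry consistent stock-specific information, the aligned loss is lower than the loss with stock identities shuffled, which is the property such an objective rewards during training.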