🤖 AI Summary
Accurate stock price forecasting is difficult for several reasons: financial markets exhibit nonlinear dynamics, the available data are sparse and noisy, latent states such as investor sentiment and market microstructure cannot be observed directly, and conventional recurrent models select historical information rigidly. To address these challenges, we propose MCI-GRU, an end-to-end prediction model that integrates multi-head cross-attention with an improved GRU. Specifically, we replace the GRU’s reset gate with an attention mechanism to enable adaptive selection of salient historical information; additionally, we design a multi-head cross-attention module that learns representations of latent market states through interactions with both temporal and cross-sectional features. Experiments on four major stock markets show that the model outperforms state-of-the-art methods across multiple metrics. The model has also been applied in real-world fund management operations, which supports its effectiveness and practicality.
📝 Abstract
As financial markets grow increasingly complex in the big data era, accurate stock prediction has become more critical. Traditional time series models, such as GRUs, have been widely used but often struggle to capture the intricate nonlinear dynamics of markets, particularly when it comes to flexibly selecting and effectively using key historical information. More recent methods, such as Graph Neural Networks and Reinforcement Learning, have shown promise in stock prediction, but they demand large amounts of high-quality data and tend to become unstable when the data are sparse or noisy. Moreover, the training and inference processes for these models are typically complex and computationally expensive, which limits their deployment in practical applications. Existing approaches also generally struggle to capture unobservable latent market states, such as market sentiment and expectations, microstructural factors, and participant behavior patterns. This leads to an inadequate understanding of market dynamics, which in turn degrades prediction accuracy. To address these challenges, this paper proposes a stock prediction model, MCI-GRU, based on a multi-head cross-attention mechanism and an improved GRU. First, we enhance the GRU model by replacing the reset gate with an attention mechanism, thereby increasing the model's flexibility in selecting and using historical information. Second, we design a multi-head cross-attention mechanism for learning representations of unobservable latent market states, which are further enriched through interactions with both temporal features and cross-sectional features. Finally, extensive experiments on four major stock markets show that the proposed method outperforms state-of-the-art (SOTA) techniques across multiple metrics. Additionally, its successful application in real-world fund management operations confirms its effectiveness and practicality.
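To make the first idea more concrete, the following is a minimal sketch of a GRU cell in which the reset gate is replaced by an attention step over past hidden states. The abstract does not give the equations, so everything specific here is an assumption for illustration: the class name `AttentionGRUCell`, the helper `encode`, the query projection `W_q`, the scaled dot-product scoring, and the choice to attend over all earlier hidden states are hypothetical and may differ from the actual MCI-GRU formulation. It uses only the Python standard library and performs a forward pass without training.

```python
import math
import random

Vector = list[float]
Matrix = list[list[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class AttentionGRUCell:
    """GRU-style cell with the reset gate swapped for attention (assumed form)."""

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.W_z = random_matrix(hidden_size, input_size, rng)   # update gate, input
        self.U_z = random_matrix(hidden_size, hidden_size, rng)  # update gate, hidden
        self.W_q = random_matrix(hidden_size, input_size, rng)   # attention query (assumption)
        self.W_h = random_matrix(hidden_size, input_size, rng)   # candidate, input
        self.U_h = random_matrix(hidden_size, hidden_size, rng)  # candidate, context

    def attend(self, x: Vector, history: list[Vector]) -> tuple[Vector, Vector]:
        """Weight past hidden states by their relevance to the current input."""
        query = matvec(self.W_q, x)
        scale = math.sqrt(self.hidden_size)
        scores = [sum(q * h for q, h in zip(query, past)) / scale for past in history]
        weights = softmax(scores)
        context = [
            sum(w * past[j] for w, past in zip(weights, history))
            for j in range(self.hidden_size)
        ]
        return context, weights

    def step(self, x: Vector, history: list[Vector]) -> Vector:
        h_prev = history[-1]
        # Standard GRU update gate.
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W_z, x), matvec(self.U_z, h_prev))]
        # A standard GRU would use (reset gate * h_prev) here; this sketch
        # uses an attention-weighted mix of past hidden states instead.
        context, _ = self.attend(x, history)
        candidate = [math.tanh(a + b)
                     for a, b in zip(matvec(self.W_h, x), matvec(self.U_h, context))]
        return [(1.0 - zi) * hp + zi * ci
                for zi, hp, ci in zip(z, h_prev, candidate)]


def encode(cell: AttentionGRUCell, sequence: list[Vector]) -> Vector:
    """Run the cell over a sequence and return the final hidden state."""
    history = [[0.0] * cell.hidden_size]  # initial hidden state
    for x in sequence:
        history.append(cell.step(x, history))
    return history[-1]
```

Calling `encode(AttentionGRUCell(input_size=3, hidden_size=4), sequence)` on a list of three-dimensional feature vectors returns a four-dimensional hidden state. The design point the sketch tries to convey is that an attention step produces data-dependent weights over historical information, whereas a reset gate only rescales the single previous hidden state element by element.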