MCI-GRU: Stock Prediction Model Based on Multi-Head Cross-Attention and Improved GRU

📅 2024-09-25

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

Accurate stock price forecasting is hard because financial markets exhibit nonlinear dynamics, the data are sparse and noisy, and latent states such as investor sentiment and market microstructure cannot be observed directly; conventional recurrent models such as GRUs are also rigid in how they select historical information. To address these challenges, the paper proposes MCI-GRU, a prediction model that combines multi-head cross-attention with an improved GRU. Specifically, it replaces the GRU's reset gate with an attention mechanism so that salient historical information is selected adaptively, and it adds a multi-head cross-attention module that learns latent market state representations through interaction with temporal and cross-sectional features. Experiments on four major stock markets show that the model outperforms state-of-the-art methods across multiple metrics. The authors also report that it has been applied in real-world fund management operations, which they present as evidence of its effectiveness and practicality.

📝 Abstract

As financial markets grow increasingly complex in the big data era, accurate stock prediction has become more critical. Traditional time series models, such as GRUs, have been widely used but often struggle to capture the intricate nonlinear dynamics of markets, particularly in the flexible selection and effective utilization of key historical information. Recently, methods like Graph Neural Networks and Reinforcement Learning have shown promise in stock prediction but require high data quality and quantity, and they tend to exhibit instability when dealing with data sparsity and noise. Moreover, the training and inference processes for these models are typically complex and computationally expensive, limiting their broad deployment in practical applications. Existing approaches also generally struggle to capture unobservable latent market states effectively, such as market sentiment and expectations, microstructural factors, and participant behavior patterns, leading to an inadequate understanding of market dynamics and subsequently impacting prediction accuracy. To address these challenges, this paper proposes a stock prediction model, MCI-GRU, based on a multi-head cross-attention mechanism and an improved GRU. First, we enhance the GRU model by replacing the reset gate with an attention mechanism, thereby increasing the model's flexibility in selecting and utilizing historical information. Second, we design a multi-head cross-attention mechanism for learning unobservable latent market state representations, which are further enriched through interactions with both temporal features and cross-sectional features. Finally, extensive experiments on four main stock markets show that the proposed method outperforms SOTA techniques across multiple metrics. Additionally, its successful application in real-world fund management operations confirms its effectiveness and practicality.
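To make the first contribution concrete, here is a minimal NumPy sketch of a GRU step in which the reset gate is replaced by attention over past hidden states. The abstract does not give the equations, so the specific form below is an assumption for illustration: the query comes from the current input, the keys come from earlier hidden states, and the attention-weighted context takes the place of the usual `r * h_prev` term. The names `attention_gru_step`, `init_params`, `run_sequence`, and all weight names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def init_params(d_in, d_h, rng):
    """Small random weights; names and shapes are illustrative assumptions."""
    shapes = {
        "W_z": (d_h, d_in), "U_z": (d_h, d_h),  # update gate (standard GRU)
        "W_q": (d_h, d_in), "W_k": (d_h, d_h),  # attention query / key maps
        "W_h": (d_h, d_in), "U_h": (d_h, d_h),  # candidate state
    }
    return {name: rng.normal(scale=0.1, size=s) for name, s in shapes.items()}

def attention_gru_step(x_t, history, p):
    """One step. `history` is a (t, d_h) array of past hidden states;
    its last row is h_{t-1}."""
    h_prev = history[-1]
    d_h = h_prev.shape[0]
    # Update gate, unchanged from a standard GRU.
    z = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev)
    # Assumed replacement for the reset gate: scaled dot-product attention
    # from the current input (query) to past hidden states (keys/values).
    q = p["W_q"] @ x_t
    keys = history @ p["W_k"].T
    alpha = softmax(keys @ q / np.sqrt(d_h))
    context = alpha @ history
    # The context vector stands in for r * h_prev in the candidate state.
    h_tilde = np.tanh(p["W_h"] @ x_t + p["U_h"] @ context)
    h_t = (1.0 - z) * h_prev + z * h_tilde
    return h_t, alpha

def run_sequence(xs, p, d_h):
    """Run the cell over a (T, d_in) sequence, starting from a zero state."""
    history = np.zeros((1, d_h))
    for x_t in xs:
        h_t, _ = attention_gru_step(x_t, history, p)
        history = np.vstack([history, h_t])
    return history[1:]  # drop the initial zero state

rng = np.random.default_rng(0)
params = init_params(d_in=5, d_h=8, rng=rng)
hidden = run_sequence(rng.normal(size=(10, 5)), params, d_h=8)
```

In this sketch the attention weights `alpha` play the role the reset gate normally plays: rather than scaling only the previous hidden state element-wise, they decide how much each earlier time step contributes to the candidate state. The paper's actual formulation may differ.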

Problem

Research questions and friction points this paper is trying to address.

Improves stock prediction by enhancing GRU with attention mechanisms

Captures unobservable market states like sentiment and microstructure

Addresses data sparsity and noise challenges in financial forecasting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Improved GRU whose reset gate is replaced by an attention mechanism

Multi-head cross-attention for learning latent market state representations

Outperforms SOTA methods on four stock markets across multiple metrics
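The cross-attention idea listed above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: per-stock features act as queries, a small set of learnable latent state vectors acts as keys and values, and the number of latent vectors, the head count, and every name here (`multi_head_cross_attention`, `W_q`, `W_k`, `W_v`, `W_o`) are hypothetical.

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(m, n_heads):
    """(rows, d) -> (n_heads, rows, d // n_heads)."""
    rows, d = m.shape
    return m.reshape(rows, n_heads, d // n_heads).transpose(1, 0, 2)

def multi_head_cross_attention(features, latent, p, n_heads):
    """features: (N, d) per-stock features used as queries.
    latent:   (K, d) latent market state vectors used as keys and values
              (learned in a real model; random here)."""
    n, d = features.shape
    d_head = d // n_heads
    q = split_heads(features @ p["W_q"], n_heads)  # (H, N, d_head)
    k = split_heads(latent @ p["W_k"], n_heads)    # (H, K, d_head)
    v = split_heads(latent @ p["W_v"], n_heads)    # (H, K, d_head)
    # Each stock attends over the K latent states, separately per head.
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))  # (H, N, K)
    heads = weights @ v                                            # (H, N, d_head)
    merged = heads.transpose(1, 0, 2).reshape(n, d)                # (N, d)
    return merged @ p["W_o"], weights

rng = np.random.default_rng(0)
d, n_heads, n_stocks, n_latent = 8, 2, 6, 4
p = {name: rng.normal(scale=0.1, size=(d, d)) for name in ("W_q", "W_k", "W_v", "W_o")}

# Stand-ins for the two feature views named in the abstract.
temporal = rng.normal(size=(n_stocks, d))         # e.g. final GRU hidden states
cross_sectional = rng.normal(size=(n_stocks, d))  # e.g. features relating stocks on one day
latent_states = rng.normal(size=(n_latent, d))

out_t, w_t = multi_head_cross_attention(temporal, latent_states, p, n_heads)
out_c, w_c = multi_head_cross_attention(cross_sectional, latent_states, p, n_heads)
enriched = np.concatenate([temporal, cross_sectional, out_t, out_c], axis=1)
```

Sharing one set of latent vectors and one set of projection weights across both views keeps the example short; a real model would more plausibly use separate parameters per view, and concatenating the outputs is only one possible way to combine them.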
