Dynamic stacking ensemble learning with investor knowledge representations for stock market index prediction based on multi-source financial data

📅 2025-12-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses two key challenges in financial forecasting: (1) the difficulty of fusing heterogeneous multi-source data (e.g., global and sectoral indices and financial news), and (2) insufficient modeling of investor cognitive heterogeneity. To tackle these, we propose a two-stage dynamic stacking ensemble framework. In Stage I, investor knowledge representation is leveraged to extract discriminative multimodal features via an LSTM-CNN-BERT fusion architecture. In Stage II, a cognition-aware data attribute identification mechanism and a time-window-adaptive strategy for switching meta-classifiers online are introduced. Our key innovations are embedding investor cognition into the dynamic stacking architecture and designing a prediction-confidence-driven meta-model selection mechanism. Empirical evaluation on daily direction prediction for the Shanghai Composite Index, Shenzhen Component Index, and ChiNext Index shows accuracy gains of 1.42%, 7.94%, and 7.73%, respectively, over the best competing models. A trading strategy built on the model also achieves a higher accumulated return and Sharpe ratio than competing strategies.

📝 Abstract
The patterns of different financial data sources vary substantially, and investors accordingly exhibit heterogeneous cognitive behavior when processing information. To capture these different patterns, we propose a two-stage dynamic stacking ensemble model based on investor knowledge representations, which aims to effectively extract and integrate features from multi-source financial data. In the first stage, we identify the distinct properties of global stock market indices, industrial indices, and financial news from the perspective of investors. We then design neural network architectures tailored to these properties to generate effective feature representations. In the second stage, based on the learned feature representations, we design multiple meta-classifiers and dynamically select the optimal one for each time window, enabling the model to capture the distinct patterns that emerge across different temporal periods. To evaluate the proposed model, we apply it to predicting the daily movement of the Shanghai Securities Composite Index (SSEC), the SZSE Component Index (SZEC), and the Growth Enterprise Index (GEI) in the Chinese stock market. The experimental results demonstrate the effectiveness of our model in improving prediction performance. In terms of accuracy, our approach outperforms the best competing models by 1.42%, 7.94%, and 7.73% on the SSEC, SZEC, and GEI indices, respectively. In addition, we design a trading strategy based on the proposed model. The economic results show that, compared with the competing trading strategies, our strategy delivers superior performance in terms of accumulated return and Sharpe ratio.
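The two economic metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the trading rule itself, so everything here is an assumption for illustration: a hypothetical long-or-cash rule (hold the index on days predicted "up", stay in cash otherwise), a zero risk-free rate, and 252 trading days per year. The function names are illustrative and do not come from the paper.

```python
import math
from statistics import mean, stdev


def strategy_returns(predictions, index_returns):
    """Hypothetical rule: earn the index return on days predicted up (1),
    earn nothing on days predicted down (0)."""
    return [r if p == 1 else 0.0 for p, r in zip(predictions, index_returns)]


def accumulated_return(daily_returns):
    """Compound the daily returns into a total return over the period."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample
    standard deviation, scaled by sqrt(periods_per_year)."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = stdev(excess)
    if spread == 0:
        return 0.0  # undefined for constant returns; report 0 in this sketch
    return mean(excess) / spread * math.sqrt(periods_per_year)


# Toy data, not results from the paper.
predictions = [1, 0, 1, 1]
index_returns = [0.01, -0.02, 0.03, -0.01]
daily = strategy_returns(predictions, index_returns)
print(accumulated_return(daily), sharpe_ratio(daily))
```

In this toy run the strategy sits out the second day's loss, which is the channel through which better direction accuracy would translate into a higher accumulated return.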
Problem

Research questions and friction points this paper is trying to address.

Predicting stock market index movements using multi-source financial data
Capturing varied investor cognition patterns through dynamic ensemble learning
Improving prediction accuracy and trading strategy performance in Chinese markets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic stacking ensemble model with investor knowledge representations
Tailored neural networks for multi-source financial data properties
Time-window-based dynamic selection of meta-classifiers to capture period-specific patterns
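The time-window selection idea can be sketched as follows. This is not the paper's implementation: the three meta-classifiers are hypothetical stand-ins that combine base-model "up" probabilities, and the selection rule (use in each window the meta-classifier that was most accurate in the preceding window) is an assumption, since the page does not spell out the exact criterion.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A meta-classifier maps base-model "up" probabilities to a 0/1 direction.
MetaClassifier = Callable[[Sequence[float]], int]


def mean_vote(probs: Sequence[float]) -> int:
    """Predict up when the average base probability is at least 0.5."""
    return int(sum(probs) / len(probs) >= 0.5)


def majority_vote(probs: Sequence[float]) -> int:
    """Predict up when more than half of the base models say up."""
    ups = sum(p >= 0.5 for p in probs)
    return int(ups * 2 > len(probs))


def max_confidence(probs: Sequence[float]) -> int:
    """Follow the base model whose probability is farthest from 0.5."""
    most_confident = max(probs, key=lambda p: abs(p - 0.5))
    return int(most_confident >= 0.5)


# Hypothetical pool of meta-classifiers; ties go to the earliest entry.
META: Dict[str, MetaClassifier] = {
    "mean_vote": mean_vote,
    "majority_vote": majority_vote,
    "max_confidence": max_confidence,
}


def accuracy(clf: MetaClassifier, X, y) -> float:
    """Share of samples in (X, y) that clf labels correctly."""
    return sum(clf(x) == t for x, t in zip(X, y)) / len(X)


def dynamic_select(X, y, window: int) -> List[Tuple[int, str, List[int]]]:
    """For each window k >= 1, pick the meta-classifier with the best
    accuracy on window k-1 and use it to predict window k."""
    results = []
    for start in range(window, len(X), window):
        prev_X, prev_y = X[start - window:start], y[start - window:start]
        best = max(META, key=lambda name: accuracy(META[name], prev_X, prev_y))
        preds = [META[best](x) for x in X[start:start + window]]
        results.append((start // window, best, preds))
    return results


# Toy data: rows are base-model probabilities, labels are 0/1 directions.
X = [(0.9, 0.4, 0.4), (0.1, 0.6, 0.6)] * 3
y = [0, 1, 1, 0, 1, 0]
for k, name, preds in dynamic_select(X, y, window=2):
    print(k, name, preds)
```

In the toy data the labels of the first window favor `majority_vote` and those of the second favor `mean_vote`, so the selected meta-classifier switches between windows. In a real pipeline, the base probabilities would come from the stage-one feature networks, and the meta-classifiers would be trained models rather than fixed rules.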
Ruize Gao
Digital Economy Lab, Beijing Institute of Mathematical Sciences and Applications, Beijing, China
Mei Yang
University of Nevada, Las Vegas
Computer architectures, interconnection networks, cloud computing, machine learning
Yu Wang
School of Economics and Business Administration, Chongqing University, Chongqing, China
Shaoze Cui
Beijing Institute of Technology
business analytics, machine learning, data mining, predictive modeling