Dynamic stacking ensemble learning with investor knowledge representations for stock market index prediction based on multi-source financial data

📅 2025-12-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses two key challenges in financial forecasting: (1) the difficulty of fusing heterogeneous multi-source data (e.g., global and sectoral indices and financial news), and (2) insufficient modeling of investor cognitive heterogeneity. To tackle these, the authors propose a two-stage dynamic stacking ensemble framework. In Stage I, investor knowledge representation is leveraged to extract discriminative multimodal features via an LSTM-CNN-BERT fusion architecture. In Stage II, a cognition-aware data attribute identification mechanism and a time-window-adaptive meta-classifier online switching strategy are introduced. The key innovations are embedding investor cognition into the dynamic stacking architecture and designing a prediction-confidence-driven meta-model selection mechanism. Empirical evaluation on daily direction prediction for the Shanghai Composite Index, Shenzhen Component Index, and ChiNext Index shows accuracy improvements of 1.42%, 7.94%, and 7.73%, respectively, over the best competing models. A trading strategy built on the model also achieves a higher accumulated return and Sharpe ratio than competing strategies.

📝 Abstract

The patterns of different financial data sources vary substantially, and accordingly, investors exhibit heterogeneous cognitive behavior in information processing. To capture these different patterns, we propose a novel two-stage dynamic stacking ensemble model based on investor knowledge representations, which aims to effectively extract and integrate features from multi-source financial data. In the first stage, we identify the properties of different financial data (global stock market indices, industrial indices, and financial news) from the perspective of investors. We then design neural network architectures tailored to these properties to generate effective feature representations. In the second stage, based on the learned feature representations, we design multiple meta-classifiers and dynamically select the optimal one for each time window, enabling the model to capture the distinct patterns that emerge across different time periods. To evaluate the proposed model, we apply it to predicting the daily movement of the Shanghai Securities Composite index (SSEC), the SZSE Component index (SZEC), and the Growth Enterprise index (GEI) in the Chinese stock market. The experimental results demonstrate the effectiveness of our model in improving prediction performance. In terms of accuracy, our approach outperforms the best competing models by 1.42%, 7.94%, and 7.73% on the SSEC, SZEC, and GEI indices, respectively. In addition, we design a trading strategy based on the proposed model. The economic results show that, compared to the competing trading strategies, our strategy delivers superior performance in terms of accumulated return and Sharpe ratio.
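The abstract reports accumulated return and Sharpe ratio but does not spell out the trading rule or the exact formulas. As a minimal sketch, the two metrics could be computed as follows, assuming a hypothetical long-or-flat rule (hold the index on days predicted "up", hold cash otherwise), simple daily returns, a zero risk-free rate, and 252 trading days per year. The function names and the sample numbers are illustrative and are not taken from the paper.

```python
import math
from statistics import mean, stdev


def strategy_returns(index_returns, predicted_up):
    """Assumed rule: earn the index return on days predicted 'up', 0.0 otherwise."""
    return [r if up else 0.0 for r, up in zip(index_returns, predicted_up)]


def accumulated_return(returns):
    """Compound simple daily returns into a single total return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


def sharpe_ratio(returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from daily returns, using the sample std. dev."""
    excess = [r - risk_free_daily for r in returns]
    sd = stdev(excess)
    if sd == 0.0:
        return float("nan")  # undefined when returns never vary
    return mean(excess) / sd * math.sqrt(periods_per_year)


# Made-up numbers for illustration only.
index_returns = [0.01, -0.02, 0.015, -0.005, 0.02]
predicted_up = [True, False, True, False, True]

daily = strategy_returns(index_returns, predicted_up)
print("accumulated return:", accumulated_return(daily))
print("annualized Sharpe:", sharpe_ratio(daily))
```

Because both metrics depend only on the daily return series, the same two functions can score any competing strategy on equal terms.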

Problem

Research questions and friction points this paper is trying to address.

Predicting stock market index movements using multi-source financial data

Capturing varied investor cognition patterns through dynamic ensemble learning

Improving prediction accuracy and trading strategy performance in Chinese markets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic stacking ensemble model with investor knowledge representations

Tailored neural networks for multi-source financial data properties

Time-window-based dynamic meta-classifier selection to capture period-specific patterns
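The time-window-based selection idea can be illustrated with a small walk-forward sketch. The abstract does not give the selection criterion, so this example assumes the simplest plausible one: pick the candidate with the highest accuracy on the trailing window and use it for the next day. The candidate meta-classifiers, the feature layout (two base-learner "up" probabilities), and the data are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
MetaClassifier = Callable[[Features], int]  # returns 1 for "up", 0 for "down"


def window_accuracy(clf: MetaClassifier, X: Sequence[Features], y: Sequence[int]) -> float:
    """Fraction of labels in the window that the classifier gets right."""
    if not y:
        return 0.0
    hits = sum(1 for x, label in zip(X, y) if clf(x) == label)
    return hits / len(y)


def select_meta_classifier(candidates: Dict[str, MetaClassifier],
                           X_recent: Sequence[Features],
                           y_recent: Sequence[int]) -> str:
    """Name of the most accurate candidate on the recent window.

    Ties go to the candidate listed first (dicts keep insertion order).
    """
    return max(candidates,
               key=lambda name: window_accuracy(candidates[name], X_recent, y_recent))


def dynamic_predict(candidates: Dict[str, MetaClassifier],
                    X: Sequence[Features],
                    y: Sequence[int],
                    window: int) -> List[Tuple[str, int]]:
    """For each day t >= window, select on days [t - window, t) and predict day t."""
    out = []
    for t in range(window, len(X)):
        name = select_meta_classifier(candidates, X[t - window:t], y[t - window:t])
        out.append((name, candidates[name](X[t])))
    return out


# Hypothetical meta-classifiers over two base-learner probabilities:
# x[0] from an index-based learner, x[1] from a news-based learner.
candidates: Dict[str, MetaClassifier] = {
    "trust_index": lambda x: int(x[0] > 0.5),
    "trust_news": lambda x: int(x[1] > 0.5),
    "average": lambda x: int((x[0] + x[1]) / 2 > 0.5),
}

# Made-up base-learner outputs and true directions, for illustration only.
X = [(0.9, 0.2), (0.8, 0.3), (0.1, 0.7), (0.3, 0.8), (0.2, 0.9), (0.7, 0.1)]
y = [1, 1, 0, 1, 1, 0]

result = dynamic_predict(candidates, X, y, window=2)
for name, pred in result:
    print(name, pred)
```

Selection uses only labels from days before the one being predicted, so the walk-forward loop does not leak future information. In practice the candidates would be trained models rather than fixed rules, and the window length would be tuned.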
