CNN-TFT explained by SHAP with multi-head attention weights for time series forecasting

📅 2025-10-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To jointly model local temporal patterns and long-range dependencies in multivariate time series forecasting, while maintaining interpretability, this paper proposes CNN-TFT-SHAP-MHAW: a hybrid architecture that integrates 1D convolutional neural networks (for local feature extraction) with the Temporal Fusion Transformer (TFT) (for modeling global dynamic dependencies), augmented by a novel SHAP with multi-head attention weights (SHAP-MHAW) method that enables fine-grained feature attribution. Evaluated on hydropower flow forecasting, the model achieves a mean absolute percentage error (MAPE) of 2.2%, substantially outperforming state-of-the-art models including LSTM, TCN, and Informer. Key contributions include: (1) a synergistic CNN-TFT modeling framework; (2) an attention-guided interpretability enhancement mechanism; and (3) a unified high-accuracy, high-interpretability solution tailored for industrial time-series applications.

📝 Abstract
Convolutional neural networks (CNNs) and transformer architectures offer complementary strengths for modeling temporal data: CNNs excel at capturing local patterns and translational invariances, while transformers effectively model long-range dependencies via self-attention. This paper proposes a hybrid architecture integrating convolutional feature extraction with a temporal fusion transformer (TFT) backbone to enhance multivariate time series forecasting. The CNN module first applies a hierarchy of one-dimensional convolutional layers to distill salient local patterns from raw input sequences, reducing noise and dimensionality. The resulting feature maps are then fed into the TFT, which applies multi-head attention to capture both short- and long-term dependencies and to weigh relevant covariates adaptively. We evaluate the CNN-TFT on a hydroelectric natural flow time series dataset. Experimental results demonstrate that CNN-TFT outperforms well-established deep learning models, with a mean absolute percentage error of at most 2.2%. The explainability of the model is provided by a proposed Shapley additive explanations with multi-head attention weights (SHAP-MHAW) method. Our novel architecture, named CNN-TFT-SHAP-MHAW, is promising for applications requiring high-fidelity, multivariate time series forecasts, and is available for future analysis at https://github.com/SFStefenon/CNN-TFT-SHAP-MHAW.
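The CNN-to-attention pipeline described in the abstract can be sketched in plain NumPy. The kernel bank, identity query/key/value projections, and layer sizes below are illustrative assumptions for exposition only, not the paper's actual CNN-TFT configuration:

```python
import numpy as np

def conv1d_features(x, kernels):
    """Valid-mode 1D convolution of a (T, C) multivariate sequence with a
    bank of (K, C) kernels, giving a (T-K+1, F) feature map followed by a
    ReLU -- a stand-in for the paper's CNN local-pattern extractor."""
    T, C = x.shape
    K = kernels[0].shape[0]
    out = np.zeros((T - K + 1, len(kernels)))
    for f, w in enumerate(kernels):
        for t in range(T - K + 1):
            out[t, f] = np.sum(x[t:t + K] * w)
    return np.maximum(out, 0.0)

def multi_head_attention(h, n_heads):
    """Minimal multi-head self-attention over a (T, D) feature map with
    identity projections (an assumption; the TFT learns projections).
    Returns the concatenated head outputs and per-head attention maps."""
    T, D = h.shape
    d = D // n_heads
    outs, weights = [], []
    for i in range(n_heads):
        q = k = v = h[:, i * d:(i + 1) * d]
        scores = q @ k.T / np.sqrt(d)
        a = np.exp(scores - scores.max(axis=-1, keepdims=True))
        a /= a.sum(axis=-1, keepdims=True)   # softmax over key positions
        outs.append(a @ v)
        weights.append(a)
    return np.concatenate(outs, axis=-1), np.stack(weights)
```

As in the abstract's description, the convolutional stage shortens and denoises the raw sequence before attention weighs timesteps (and, in the full TFT, covariates) adaptively; the returned attention maps are what SHAP-MHAW later draws on.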
Problem

Research questions and friction points this paper is trying to address.

Enhancing multivariate time series forecasting accuracy
Integrating convolutional networks with transformer architectures
Providing explainable predictions using SHAP with attention weights
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid CNN and temporal fusion transformer architecture
Multi-head attention captures short and long-term dependencies
SHAP explainability with multi-head attention weights
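The idea of combining SHAP attributions with attention weights can be illustrated with a simplified re-weighting scheme. The exact SHAP-MHAW formulation is defined in the paper; the head-averaged salience used here is an assumption made for the sketch:

```python
import numpy as np

def attention_weighted_shap(shap_values, attn_weights):
    """Illustrative re-weighting of SHAP attributions by attention
    (an assumed, simplified variant of the paper's SHAP-MHAW).

    shap_values:  (T, C) per-timestep, per-covariate SHAP attributions
    attn_weights: (H, T, T) attention maps from H heads over timesteps
    """
    # Average over heads and query positions -> per-timestep salience,
    # then normalize so the weights sum to one.
    salience = attn_weights.mean(axis=(0, 1))
    salience = salience / salience.sum()
    # Scale each timestep's attributions by how much attention it drew.
    return shap_values * salience[:, None]
```

The effect is that attributions at timesteps the model actually attends to are emphasized, while attributions at ignored timesteps are suppressed, yielding the fine-grained, attention-guided feature importance the Innovation points describe.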
Stefano F. Stefenon
Lisbon School of Engineering (ISEL), Polytechnic University of Lisbon, Lisbon 1959-007, Portugal, and Faculty of Engineering and Applied Sciences, University of Regina, Saskatchewan, S4S 0A2, Canada
João P. Matos-Carvalho
LASIGE, Informática, Faculdade de Ciências, Universidade de Lisboa, Portugal
Computer Vision · Artificial Intelligence · Deep Learning · Image Processing · UAVs
Valderi R. Q. Leithardt
Instituto Universitário de Lisboa (ISCTE-IUL), ISTAR, 1649-026, Lisboa, Portugal
Kin-Choong Yow
Faculty of Engineering and Applied Sciences, University of Regina, Saskatchewan, S4S 0A2, Canada