🤖 AI Summary
To jointly model local temporal patterns and long-range dependencies while maintaining interpretability in multivariate time series forecasting, this paper proposes CNN-TFT-SHAP-MHAW: a hybrid architecture that combines one-dimensional convolutional neural networks (for local feature extraction) with the Temporal Fusion Transformer (TFT) (for modeling global dynamic dependencies), augmented by a novel Shapley additive explanations with multi-head attention weights (SHAP-MHAW) method that enables fine-grained feature attribution. Evaluated on hydroelectric natural flow forecasting, the model achieves a mean absolute percentage error (MAPE) of 2.2%, outperforming well-established deep learning models such as LSTM, TCN, and Informer. Key contributions include: (1) a synergistic CNN-TFT modeling framework; (2) an attention-guided interpretability mechanism; and (3) a unified high-accuracy, high-interpretability solution tailored for industrial time-series applications.
📝 Abstract
Convolutional neural networks (CNNs) and transformer architectures offer complementary strengths for modeling temporal data: CNNs excel at capturing local patterns and translational invariances, while transformers effectively model long-range dependencies via self-attention. This paper proposes a hybrid architecture integrating convolutional feature extraction with a temporal fusion transformer (TFT) backbone to enhance multivariate time series forecasting. The CNN module first applies a hierarchy of one-dimensional convolutional layers to distill salient local patterns from raw input sequences, reducing noise and dimensionality. The resulting feature maps are then fed into the TFT, which applies multi-head attention to capture both short- and long-term dependencies and to weigh relevant covariates adaptively. We evaluate the CNN-TFT on a hydroelectric natural flow time series dataset. Experimental results demonstrate that CNN-TFT outperforms well-established deep learning models, achieving a mean absolute percentage error as low as 2.2%. Model explainability is obtained through a proposed Shapley additive explanations with multi-head attention weights (SHAP-MHAW) method. Our novel architecture, named CNN-TFT-SHAP-MHAW, is promising for applications requiring high-fidelity, multivariate time series forecasts, and is available for future analysis at https://github.com/SFStefenon/CNN-TFT-SHAP-MHAW.
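To make the pipeline concrete, below is a minimal NumPy sketch of the abstract's data flow: 1D convolution over a multivariate series to extract local features, multi-head self-attention over those features, and an attention-weighted rescaling of per-feature attributions in the spirit of SHAP-MHAW. All shapes, the identity Q/K/V projections, the random stand-in for real SHAP values, and the weighting scheme are illustrative assumptions, not the paper's implementation (see the linked repository for that).

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1D convolution over time. x: (T, C_in); kernels: (C_out, k, C_in)."""
    c_out, k, _ = kernels.shape
    t_out = x.shape[0] - k + 1
    out = np.empty((t_out, c_out))
    for t in range(t_out):
        # each output channel is a dot product of its kernel with the window
        out[t] = np.einsum('okc,kc->o', kernels, x[t:t + k])
    return np.maximum(out, 0.0)  # ReLU

def multi_head_attention(h, n_heads):
    """Scaled dot-product attention per head (identity projections for brevity).

    h: (T, D). Returns the concatenated head outputs and the head-averaged
    (T, T) attention matrix, used below as a simple importance signal.
    """
    t, d_model = h.shape
    d = d_model // n_heads
    outs, weights = [], []
    for i in range(n_heads):
        q = k = v = h[:, i * d:(i + 1) * d]
        scores = q @ k.T / np.sqrt(d)
        scores = np.exp(scores - scores.max(axis=1, keepdims=True))
        attn = scores / scores.sum(axis=1, keepdims=True)  # row-stochastic
        outs.append(attn @ v)
        weights.append(attn)
    return np.concatenate(outs, axis=1), np.mean(weights, axis=0)

# Toy multivariate series: 32 time steps, 3 covariates.
x = rng.normal(size=(32, 3))
kernels = rng.normal(size=(4, 5, 3)) * 0.1    # 4 filters of width 5
feats = conv1d(x, kernels)                    # (28, 4) local feature maps
ctx, attn = multi_head_attention(feats, n_heads=2)

# SHAP-MHAW-style weighting (hypothetical scheme): rescale per-feature
# attributions by how much attention each time step received on average.
shap_values = rng.normal(size=feats.shape)    # stand-in for real SHAP output
step_importance = attn.mean(axis=0)           # (28,) attention received
mhaw_shap = shap_values * step_importance[:, None]
```

The sketch only shows how attention weights could modulate attributions; the actual method computes SHAP values for the trained CNN-TFT and combines them with the TFT's learned attention heads.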