🤖 AI Summary
Time series forecasting models often lack interpretability, and existing post-hoc explanation methods such as LIME are ill-suited to forecasting tasks. To address this, the authors propose PAX-TS, a model-agnostic post-hoc explanation algorithm that generates multi-granular explanations from localized input perturbations. It produces time step correlation matrices and, for multivariate series, characterizes cross-channel correlations. From a benchmark of seven forecasting algorithms on ten datasets, the authors identify six classes of patterns that recur in the time step correlation matrices across datasets and algorithms. These patterns are indicators of performance, with noticeable differences in forecasting error between the classes. Explanations of high-performing and low-performing algorithms also differ on the same datasets, which suggests that PAX-TS captures model behavior. The method is compared with two other state-of-the-art explanation algorithms.
📝 Abstract
Time series forecasting has improved considerably in recent years, with transformer models and large language models advancing the state of the art. Modern forecasting models are generally opaque and do not provide explanations for their forecasts, while well-known post-hoc explainability methods like LIME are not suitable for the forecasting context. We propose PAX-TS, a model-agnostic post-hoc algorithm to explain time series forecasting models and their forecasts. Our method is based on localized input perturbations and results in multi-granular explanations. Further, it is able to characterize cross-channel correlations for multivariate time series forecasts. We outline the algorithmic procedure behind PAX-TS, demonstrate it on a benchmark with 7 algorithms and 10 diverse datasets, compare it with two other state-of-the-art explanation algorithms, and present the different explanation types of the method. We found that the explanations of high-performing and low-performing algorithms differ on the same datasets, highlighting that the explanations of PAX-TS effectively capture a model's behavior. Based on time step correlation matrices resulting from the benchmark, we identify 6 classes of patterns that repeatedly occur across different datasets and algorithms. We found that the patterns are indicators of performance, with noticeable differences in forecasting error between the classes. Lastly, we outline a multivariate example in which PAX-TS shows how the forecasting model takes cross-channel correlations into account. With PAX-TS, the mechanisms of time series forecasting models can be illustrated at different levels of detail, and its explanations can be used to answer practical questions about forecasts.
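
To make the idea of localized input perturbations concrete, the following is a minimal sketch of a generic perturbation-based sensitivity analysis for a black-box forecaster. It is not the PAX-TS procedure, which the abstract does not specify in detail. The names `perturbation_sensitivity` and `toy_forecaster`, the `window` and `scale` parameters, and the choice to normalize by the perturbation size are all assumptions made for illustration. The resulting matrix relates each input time step to each forecast step, which is loosely analogous to, but not claimed to be the same as, the time step correlation matrices mentioned above.

```python
import numpy as np

def perturbation_sensitivity(model, x, window=1, scale=0.1):
    """Hypothetical helper (not from the paper): returns an L x H matrix
    whose entry [t, h] is the change in forecast step h per unit of
    perturbation applied to the input window starting at time step t."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(model(x), dtype=float)
    std = x.std()
    # Perturbation size relative to the series' spread (assumed choice).
    delta = scale * (std if std > 0 else 1.0)
    sens = np.zeros((len(x), len(baseline)))
    for t in range(len(x)):
        perturbed = x.copy()
        perturbed[t:t + window] += delta  # localized perturbation
        shifted = np.asarray(model(perturbed), dtype=float)
        sens[t] = (shifted - baseline) / delta
    return sens

def toy_forecaster(x, horizon=3):
    # Stand-in black-box model: repeats the mean of the last two observations.
    return np.full(horizon, x[-2:].mean())

x = np.sin(np.linspace(0, 6, 12))
S = perturbation_sensitivity(toy_forecaster, x)
print(S.shape)      # (input length, forecast horizon)
print(S.round(2))   # only the last two input steps influence the forecast
```

Because the function only calls `model(x)`, any forecaster that maps an input array to a forecast array can be substituted for the toy model, which is what makes this kind of analysis model-agnostic. Increasing `window` perturbs several adjacent time steps at once and gives a coarser view of the same sensitivities.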