Faithful and Interpretable Explanations for Complex Ensemble Time Series Forecasts using Surrogate Models and Forecastability Analysis

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
AutoML-based ensemble time-series forecasting models suffer from poor interpretability. Method: This paper proposes a dual-track interpretability framework integrating surrogate modeling and predictability analysis. First, a high-fidelity LightGBM surrogate model is trained to approximate the black-box AutoML model (e.g., AutoGluon), with local feature attributions derived via SHAP. Second, spectral predictability analysis quantifies intrinsic series predictability by comparing spectral density estimates against a noise baseline, serving as a confidence metric for explanations. Third, a per-item normalization strategy accommodates multi-scale, heterogeneous time series. Results: Experiments on the M5 dataset show that spectral predictability is significantly positively correlated with both forecasting accuracy and surrogate fidelity. The framework effectively identifies low-confidence predictions, enhances explanation stability, and improves user trust calibration.
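The spectral predictability check described above can be sketched as follows. This is a minimal illustration assuming a normalized-spectral-entropy estimator, where a flat (white-noise) spectrum scores near 0 and a strongly periodic series scores near 1; the paper's exact estimator and noise baseline may differ.

```python
import numpy as np

def spectral_predictability(series: np.ndarray) -> float:
    """Score in [0, 1]: 1 - normalized spectral entropy.
    White noise has a near-flat spectrum (max entropy), so it scores near 0;
    a series dominated by a few frequencies scores near 1."""
    x = series - series.mean()
    psd = np.abs(np.fft.rfft(x)) ** 2
    psd = psd / psd.sum()            # treat the spectrum as a distribution
    nz = psd[psd > 0]
    entropy = -np.sum(nz * np.log(nz))
    return 1.0 - entropy / np.log(len(psd))  # log(len) = entropy of flat spectrum

rng = np.random.default_rng(0)
t = np.arange(512)
noise = rng.normal(size=512)
seasonal = np.sin(2 * np.pi * t / 24) + 0.1 * rng.normal(size=512)
# the seasonal series scores far higher than pure noise
```

Under this framing, the "noise baseline" comparison amounts to checking how far a series' score sits above what pure noise of the same length would produce.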

📝 Abstract
Modern time series forecasting increasingly relies on complex ensemble models generated by AutoML systems such as AutoGluon, which deliver superior accuracy at a significant cost to transparency and interpretability. This paper introduces a comprehensive, dual-approach framework that addresses both the explainability and forecastability challenges of complex time series ensembles. First, we develop a surrogate-based explanation methodology that bridges the accuracy-interpretability gap by training a LightGBM model to faithfully mimic AutoGluon's time series forecasts, enabling stable SHAP-based feature attributions. We rigorously validate this approach through feature-injection experiments, demonstrating high faithfulness between extracted SHAP values and known ground-truth effects. Second, we integrate spectral predictability analysis to quantify each series' inherent forecastability. By comparing each series' spectral predictability against a pure-noise benchmark, we establish an objective mechanism for gauging confidence in forecasts and their explanations. Our empirical evaluation on the M5 dataset shows that higher spectral predictability correlates strongly not only with improved forecast accuracy but also with higher fidelity between the surrogate and the original forecasting model. These forecastability metrics serve as effective filters and confidence scores, enabling users to calibrate their trust in both the forecasts and their explanations. We further demonstrate that per-item normalization is essential for generating meaningful SHAP explanations across heterogeneous time series with varying scales. The resulting framework delivers interpretable, instance-level explanations for state-of-the-art ensemble forecasts while equipping users with forecastability metrics that act as reliability indicators for both predictions and their explanations.
Problem

Research questions and friction points this paper is trying to address.

Explaining complex ensemble time series forecasts for transparency and interpretability
Quantifying time series forecastability to gauge prediction confidence
Generating faithful explanations across heterogeneous time series with varying scales
Innovation

Methods, ideas, or system contributions that make the work stand out.

Surrogate LightGBM model mimics AutoGluon forecasts
Spectral predictability analysis quantifies series forecastability
Per-item normalization enables meaningful SHAP explanations
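The per-item normalization idea in the last bullet can be sketched as a simple z-scoring of each series by its own statistics, so that SHAP attributions computed over a multi-scale panel are not dominated by high-volume items. This is an illustrative sketch; the paper's exact scaling scheme may differ.

```python
import numpy as np

def per_item_normalize(panel: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """Z-score each series by its own mean and std so that items with very
    different sales volumes become comparable before explanation."""
    out = {}
    for item_id, series in panel.items():
        std = series.std()
        out[item_id] = (series - series.mean()) / (std if std > 0 else 1.0)
    return out

# Two items on very different scales, as in the M5 retail panel.
panel = {
    "item_A": np.array([100.0, 120.0, 90.0, 110.0]),
    "item_B": np.array([1.0, 2.0, 1.5, 1.2]),
}
normalized = per_item_normalize(panel)
# each normalized series now has mean ~0 and std ~1
```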