Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework

📅 2024-10-08
🏛️ arXiv.org
📈 Citations: 3
Influential: 1

🤖 AI Summary
Time-series Transformers are hard to interpret, which hinders trust and debugging. Method: The paper proposes a concept-bottleneck-based, intervenable modeling framework in which the training objective encourages latent representations to resemble predefined interpretable concepts (time features and an interpretable autoregressive (AR) surrogate model). The framework is applied to the Autoformer architecture, with Centered Kernel Alignment (CKA) used to measure and enforce similarity between representations and concepts. Because the learned concepts become local, the trained model can be intervened on directly; the paper demonstrates this by correcting for a time shift in the data without retraining. Results: On a variety of benchmark tasks, predictive performance remains mostly unaffected relative to the original Autoformer, while interpretability improves substantially.

📝 Abstract
There has been a recent push of research on Transformer-based models for long-term time series forecasting, even though they are inherently difficult to interpret and explain. While there is a large body of work on interpretability methods for various domains and architectures, the interpretability of Transformer-based forecasting models remains largely unexplored. To address this gap, we develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers. We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts. In our experiments, we enforce similarity using Centered Kernel Alignment, and the predefined concepts include time features and an interpretable, autoregressive surrogate model (AR). We apply the framework to the Autoformer model, and present an in-depth analysis for a variety of benchmark tasks. We find that the model performance remains mostly unaffected, while the model shows much improved interpretability. Additionally, interpretable concepts become local, which makes the trained model easily intervenable. As a proof of concept, we demonstrate a successful intervention in the scenario of a time shift in the data, which eliminates the need to retrain.
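
The abstract names an interpretable autoregressive (AR) surrogate as one of the predefined concepts. The paper's exact surrogate is not specified on this page, so the following is only a minimal sketch of a generic AR(p) model fitted by ordinary least squares with NumPy. The function names `fit_ar` and `forecast_ar`, the intercept term, and the choice of least squares are all assumptions made for illustration.

```python
import numpy as np


def fit_ar(series, order):
    """Fit y[t] = sum_k w[k] * y[t-1-k] + b by least squares (illustrative only)."""
    series = np.asarray(series, dtype=float)
    # Each row holds the `order` previous values, most recent first.
    rows = [series[t - order:t][::-1] for t in range(order, len(series))]
    design = np.column_stack([np.array(rows), np.ones(len(rows))])
    target = series[order:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[:-1], coef[-1]  # lag weights, intercept


def forecast_ar(history, weights, bias, steps):
    """Roll the fitted AR model forward `steps` values from the end of `history`."""
    window = list(np.asarray(history, dtype=float)[-len(weights):])
    predictions = []
    for _ in range(steps):
        lags = np.array(window[::-1])  # most recent value first, matching fit_ar
        next_value = float(lags @ weights + bias)
        predictions.append(next_value)
        window = window[1:] + [next_value]
    return np.array(predictions)
```

A model like this is interpretable in the sense that each forecast is an explicit weighted sum of recent observations, so the lag weights can be read directly.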
Problem

Research questions and friction points this paper is trying to address.

Enhancing the interpretability of time series Transformers
Using a concept bottleneck framework to build interpretability into the model during training
Applying the framework to forecasting models with minimal performance impact
Innovation

Methods, ideas, or system contributions that make the work stand out.

Concept Bottleneck Framework for interpretability
Centered Kernel Alignment to align representations with concepts
Modified training objective that encourages representations similar to predefined interpretable concepts
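
To make the CKA-based alignment concrete, here is a minimal sketch of linear Centered Kernel Alignment between a batch of hidden representations and a batch of concept values, using NumPy. The linear-kernel variant is one common formulation and may differ from the kernel used in the paper. The helper `combined_loss` and its weighting parameter `alpha` are hypothetical: they only illustrate how a similarity term could be added to a forecasting loss, not how the authors define their objective.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_samples, n_features) matrices; result lies in [0, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each feature column so the similarity ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def combined_loss(forecast_loss, cka_value, alpha=0.5):
    """Hypothetical objective: trade off forecast error against concept dissimilarity."""
    return (1.0 - alpha) * forecast_loss + alpha * (1.0 - cka_value)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either input, so a representation can score highly even when it encodes a concept in a rotated or rescaled basis.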