Exposing Vulnerabilities in Explanation for Time Series Classifiers via Dual-Target Attacks

📅 2026-02-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a common misconception in time series classification: that consistent explanations reliably indicate robust decisions. That assumption overlooks the risk that predictions and explanations can be maliciously decoupled. To expose this vulnerability, we propose TSEF (Time Series Explanation Fooler), a dual-objective adversarial attack framework that uses gradient-based optimization to manipulate both the classifier's prediction and the attribution output of the explainer (for example, attention mechanisms or saliency maps) at the same time. TSEF achieves high targeted misclassification rates while forcing explanations to align with a specified reference rationale. Experiments across multiple datasets and explanation methods demonstrate TSEF's effectiveness, showing that explanation stability is not a valid proxy for decision robustness and that predictions and their explanations need to be evaluated jointly for coupling robustness.

📝 Abstract

Interpretable time series deep learning systems are often assessed by checking the temporal consistency of their explanations, implicitly treating this as evidence of robustness. We show that this assumption can fail: predictions and explanations can be adversarially decoupled, enabling targeted misclassification while the explanation remains plausible and consistent with a chosen reference rationale. We propose TSEF (Time Series Explanation Fooler), a dual-target attack that jointly manipulates the classifier and explainer outputs. In contrast to single-objective misclassification attacks, which disrupt the explanation and spread attribution mass broadly, TSEF achieves targeted prediction changes while keeping explanations consistent with the reference. Across multiple datasets and explainer backbones, our results consistently reveal that explanation stability is a misleading proxy for decision robustness and motivate coupling-aware robustness evaluations for trustworthy time series tasks.
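To make the dual-target idea concrete, the following is a minimal sketch and not the authors' TSEF implementation. It assumes a toy linear two-class classifier, a normalized gradient-times-input attribution as the explainer, the clean input's attribution as the reference rationale, and finite-difference gradients with a signed, L-infinity-bounded update. All names (`dual_loss`, `attack`, `EPS`, `LAM`) and all numeric values are hypothetical choices made for illustration.

```python
import math

def softmax(z):
    m = max(z)                      # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def saliency(W, x, cls):
    # Gradient-times-input attribution for a linear model, normalized to sum to 1.
    raw = [abs(w * xi) for w, xi in zip(W[cls], x)]
    total = sum(raw) or 1.0
    return [r / total for r in raw]

def dual_loss(W, b, x, target, ref, lam):
    # Term 1: cross-entropy toward the attacker's target class.
    ce = -math.log(softmax(logits(W, b, x))[target] + 1e-12)
    # Term 2: squared distance between the current and reference explanations.
    gap = sum((e - r) ** 2 for e, r in zip(saliency(W, x, target), ref))
    return ce + lam * gap

def attack(W, b, x, target, ref, eps, lam, lr=0.05, steps=200, h=1e-4):
    def loss_at(d):
        return dual_loss(W, b, [xi + di for xi, di in zip(x, d)], target, ref, lam)

    delta = [0.0] * len(x)
    best_loss, best_delta = loss_at(delta), list(delta)
    for _ in range(steps):
        grad = []
        for t in range(len(x)):     # central finite differences (assumption:
            up = list(delta)        # stands in for backpropagation)
            dn = list(delta)
            up[t] += h
            dn[t] -= h
            grad.append((loss_at(up) - loss_at(dn)) / (2 * h))
        # Signed descent step, then projection onto the L-infinity ball.
        delta = [max(-eps, min(eps, d - lr * ((g > 0) - (g < 0))))
                 for d, g in zip(delta, grad)]
        current = loss_at(delta)
        if current < best_loss:     # keep the best perturbation seen so far
            best_loss, best_delta = current, list(delta)
    return best_delta

# Hypothetical toy model and input (length-6 series, 2 classes).
W = [[0.8, -0.5, 0.3, 0.1, -0.2, 0.4],
     [-0.6, 0.7, -0.1, 0.2, 0.5, -0.3]]
b = [0.0, 0.0]
x = [1.0, -0.5, 0.8, 0.2, -0.3, 0.6]
target, EPS, LAM = 1, 0.5, 5.0
ref = saliency(W, x, target)        # reference rationale: the clean explanation

delta = attack(W, b, x, target, ref, EPS, LAM)
x_adv = [xi + di for xi, di in zip(x, delta)]
print("target-class probability:", softmax(logits(W, b, x_adv))[target])
print("explanation gap:", sum((e - r) ** 2
                              for e, r in zip(saliency(W, x_adv, target), ref)))
```

The weight `LAM` trades off the two targets: setting it to zero reduces the sketch to a single-objective misclassification attack, which is the baseline the abstract contrasts against. Whether the prediction actually flips within the budget `EPS` depends on the toy values chosen here.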

Problem

Research questions and friction points this paper is trying to address.

time series

interpretability

adversarial attacks

explanation robustness

dual-target attacks

Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-target attack

time series explanation

adversarial decoupling