🤖 AI Summary
Deep learning models for time-series classification often exploit spurious correlations—“shortcuts”—in training data, undermining generalization; however, prior work has not systematically characterized shortcut learning at the *point level*. This paper formally defines and empirically validates point-level shortcuts in time-series classification. We propose a gradient-driven detection method that requires no test set, clean labels, or external attributes: by analyzing inter-class differences in neural gradients, it localizes anomalous time points critical to model decisions. Experiments across multiple UCR benchmark datasets demonstrate that our approach effectively identifies internal model biases, achieving high robustness and strong generalization under fully unsupervised conditions. Our work establishes a novel paradigm for interpretable time-series modeling and robust training.
📝 Abstract
Deep learning models have attracted considerable research attention in time series classification (TSC) over the past two decades. Recently, deep neural networks (DNNs) have surpassed classical distance-based methods and achieved state-of-the-art performance. Despite this promising performance, DNNs have been shown to rely on spurious correlations present in the training data, which can hinder generalization. For instance, a model might incorrectly associate the presence of grass with the label "cat" if most cats in the training set are lying on grassy backgrounds. However, the shortcut behavior of DNNs on time series remains under-explored. Most existing work on shortcuts relies on external attributes such as gender or patient group, rather than focusing on the internal bias behavior of time series models.
In this paper, we take the first step toward investigating and establishing point-based shortcut learning behavior in deep learning time series classification. We further propose a simple detection method, based on other classes, that detects when a shortcut occurs without relying on test data or clean training classes. We evaluate the proposed method on UCR time series datasets.
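To make the idea of a point-level shortcut concrete, the following is a minimal sketch and not the paper's detection procedure, which the abstract does not specify. It assumes a synthetic two-class dataset in which one class carries a spike at a single time index, a plain logistic-regression classifier standing in for a DNN, and a gradient-times-input score compared across classes. All names (`make_data`, `train_logreg`, `point_scores`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def make_data(n_per_class=100, length=50, shortcut_t=20, spike=3.0, seed=0):
    """Gaussian-noise series; class 1 gets a spike at one time index (the shortcut)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(2 * n_per_class, length))
    y = np.repeat([0, 1], n_per_class)
    x[y == 1, shortcut_t] += spike
    return x, y


def train_logreg(x, y, lr=0.1, steps=300):
    """Logistic regression fitted by full-batch gradient descent (stand-in for a DNN)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
        err = p - y
        w -= lr * (x.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def point_scores(x, y, w):
    """Per-time-point score: inter-class gap in mean gradient-times-input.

    For a linear model the gradient of the logit with respect to the input
    is simply w, so gradient-times-input is x * w.
    """
    attribution = x * w
    gap = attribution[y == 1].mean(axis=0) - attribution[y == 0].mean(axis=0)
    return np.abs(gap)


x, y = make_data()
w, b = train_logreg(x, y)
scores = point_scores(x, y, w)
detected = int(np.argmax(scores))  # time index with the largest inter-class gap
print("highest-scoring time point:", detected)
```

In this toy setting the score uses only training data and labels, with no test set or external attribute. A real DNN would need the input gradients computed by automatic differentiation rather than read off the weight vector.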