🤖 AI Summary
To address the reliance on hand-crafted data augmentations, high annotation costs, and suboptimal downstream performance in unsupervised time-series representation learning, this paper proposes a dynamic contrastive learning framework. Instead of relying on fixed views or augmentation-based transformations, it constructs positive pairs directly from temporally adjacent time steps. A dynamic N-pair contrastive loss is introduced to adaptively modulate positive and negative sample selection. Crucially, the paper empirically reveals, for the first time, a significant inconsistency between unsupervised clustering metrics (e.g., Normalized Mutual Information) and downstream task performance, highlighting a critical evaluation bias. Extensive experiments on multiple benchmark datasets demonstrate state-of-the-art results on both classification and anomaly detection. Moreover, the learned embeddings exhibit strong semantic clustering structure, confirming joint gains in representation quality and downstream generalization.
📝 Abstract
Understanding events in time series is an important task in a variety of contexts. However, human analysis and labeling are expensive and time-consuming. It is therefore advantageous to learn embeddings for moments in time series in an unsupervised way, enabling good performance on classification or detection tasks with minimal subsequent human labeling. In this paper, we propose dynamic contrastive learning (DynaCL), an unsupervised contrastive representation learning framework for time series that uses temporally adjacent steps to define positive pairs. DynaCL adopts an N-pair loss to dynamically treat all samples in a batch as positive or negative pairs, enabling efficient training and avoiding complicated positive-pair sampling. We demonstrate that DynaCL embeds instances from time series into semantically meaningful clusters, yielding superior downstream performance on a variety of public time series datasets. Our findings also reveal that high scores on unsupervised clustering metrics do not guarantee that the representations are useful in downstream tasks.
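The core mechanism described above (positives from adjacent time steps, with all other batch samples acting as negatives under an N-pair-style loss) can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the use of NumPy, and the temperature parameter are illustrative assumptions, and it shows only the softmax-over-batch form common to N-pair/InfoNCE objectives:

```python
import numpy as np

def npair_temporal_loss(anchors, positives, temperature=0.1):
    """Hypothetical sketch of an N-pair-style contrastive loss.

    anchors[i]   -- embedding of time step t for sequence window i
    positives[i] -- embedding of the adjacent step t+1 (the positive pair)
    All other rows j != i in the batch serve as negatives for anchor i.
    """
    # L2-normalize so dot products become cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (B, B) similarity matrix; row i's positive sits on the diagonal
    logits = (a @ p.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability

    # Cross-entropy against the diagonal: pull adjacent steps together,
    # push all other batch samples apart
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Because every batch element serves simultaneously as a positive (for its adjacent step) and a negative (for every other anchor), no explicit negative mining or augmentation pipeline is needed, which matches the efficiency claim in the abstract.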