Inferring the Most Similar Variable-length Subsequences between Multidimensional Time Series

📅 2025-05-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the problem of finding the most similar variable-length subsequences between multidimensional time series. We propose what is, to the best of our knowledge, the first exact algorithm for this problem that is both theoretically sound and computationally efficient. Our method builds upon a dynamic programming framework, integrating aggressive pruning strategies and a multivariate distance metric to enable robust alignment across sequences of different lengths and under local temporal deformations. Unlike existing methods, which rely on approximations or fixed-length assumptions, it guarantees exact matching for variable-length subsequences. Experimental evaluation shows a speedup of roughly 4× over baseline approaches on synthetic data and up to 20× on real-world financial and animal-behavior datasets. The method uncovers cross-market stock dependencies and coordinated (following) movement patterns in baboon groups, supporting its practical utility in multidimensional time-series analysis.

📝 Abstract
Finding the most similar subsequences between two multidimensional time series has many applications, e.g. capturing dependencies in the stock market or discovering coordinated movement of baboons. Given a pattern that occurs in one time series, we may ask whether the same pattern occurs in another time series in a distorted form, possibly with a different length. To the best of our knowledge, however, there is no efficient framework that deals with this problem yet. In this work, we propose an algorithm that provides the exact solution to the problem of finding the most similar multidimensional subsequences between time series, where lengths may differ both between the time series and between the subsequences. The algorithm comes with theoretical guarantees of correctness and efficiency. Results on simulated datasets show that our approach not only returned the correct solution but also required only about a quarter of the running time of the baseline approaches. On real-world datasets, it extracted the most similar subsequences even faster (up to 20 times faster than the baseline methods) and provided insights into the stock market and into following relations in multidimensional time series of baboon movement. Our approach can be used for any time series. The code and datasets of this work are provided for public use.
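To make the task concrete, the following is a minimal brute-force sketch of the search problem the abstract describes. It is not the paper's algorithm: it is a naive exhaustive search of the kind a faster exact method would need to match in output. The choice of dependent multidimensional DTW as the distance, the normalization by the summed subsequence lengths, and the names `dtw_distance`, `most_similar_subsequences`, `min_len`, and `max_len` are all assumptions made for illustration.

```python
import numpy as np


def dtw_distance(a, b):
    """Dependent multidimensional DTW between a (n, d) and b (m, d).

    Assumption: Euclidean distance between time points, no warping window.
    """
    n, m = len(a), len(b)
    # cost[i, j] = Euclidean distance between point i of a and point j of b
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m]


def most_similar_subsequences(x, y, min_len, max_len):
    """Exhaustively search all subsequence pairs with lengths in
    [min_len, max_len] and return (score, (x_start, x_end), (y_start, y_end)).

    Assumption: the score is the DTW distance divided by the summed lengths,
    so that short subsequences are not trivially favored. End indices are
    exclusive.
    """
    best = (np.inf, None, None)
    for i in range(len(x) - min_len + 1):
        for lx in range(min_len, min(max_len, len(x) - i) + 1):
            for j in range(len(y) - min_len + 1):
                for ly in range(min_len, min(max_len, len(y) - j) + 1):
                    d = dtw_distance(x[i:i + lx], y[j:j + ly]) / (lx + ly)
                    if d < best[0]:
                        best = (d, (i, i + lx), (j, j + ly))
    return best
```

Calling `most_similar_subsequences(x, y, 4, 5)` on two arrays of shape `(n, d)` and `(m, d)` returns the best-scoring pair of windows. The four nested loops, each invoking a quadratic DTW, make this sketch impractical beyond toy inputs, which is the cost that pruning-based exact methods aim to avoid.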
Problem

Research questions and friction points this paper is trying to address.

Finding similar variable-length subsequences in multidimensional time series
Addressing lack of efficient framework for distorted pattern matching
Proposing exact solution with theoretical correctness and efficiency guarantees
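
One possible way to write the problem down, offered as a hedged sketch and not as the paper's own definition: let $X \in \mathbb{R}^{n \times d}$ and $Y \in \mathbb{R}^{m \times d}$ be two $d$-dimensional time series, and let $D$ be some distance between multidimensional subsequences of possibly different lengths. The symbols $D$, $\ell_{\min}$, and $\ell_{\max}$ are assumptions introduced here for illustration.

```latex
\[
(i^*, \ell_x^*, j^*, \ell_y^*)
  = \operatorname*{arg\,min}_{i,\,\ell_x,\,j,\,\ell_y}
    D\bigl(X_{i:i+\ell_x-1},\; Y_{j:j+\ell_y-1}\bigr)
\]
\[
\text{subject to}\quad
\ell_{\min} \le \ell_x,\ \ell_y \le \ell_{\max},\qquad
1 \le i,\ i+\ell_x-1 \le n,\qquad
1 \le j,\ j+\ell_y-1 \le m .
\]
```

Here $\ell_x$ and $\ell_y$ are allowed to differ, which is what distinguishes the variable-length setting from fixed-length subsequence matching.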
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exact algorithm for variable-length subsequence similarity
Theoretical guarantees for correctness and efficiency
Significantly faster than baseline methods
Thanadej Rattanakornphan
Department of Computer Engineering, Kasetsart University
Piyanon Charoenpoonpanich
Independent
Chainarong Amornbunchornvej
Researcher at National Electronics and Computer Technology Center, Thailand
Time Series Mining, Data Mining, Machine Learning, Bioinformatics, Data Science