🤖 AI Summary
To address the lack of metric properties in Dynamic Time Warping (DTW) and the outlier sensitivity of the Fréchet distance for polygonal curve similarity measurement, this paper proposes *k*-Dynamic Time Warping (*k*-DTW), a novel dissimilarity measure that is both a metric and robust to outliers. The core innovation lies in parameterizing the measure by the *k*-th largest matched distance, together with the first dimension-independent generalization bound for curve learning. Theoretically, *k*-DTW achieves strictly lower Rademacher and Gaussian complexities than DTW—yielding an Ω̃(√m) separation when *k* ≪ *m*—and admits a tighter generalization error bound, significantly reducing the sample complexity of median curve learning. Algorithmically, the paper provides both an exact and a (1+ε)-approximation algorithm. Experiments demonstrate that *k*-DTW consistently outperforms DTW and the Fréchet distance on clustering and nearest-neighbor classification tasks.
📝 Abstract
This paper introduces $k$-Dynamic Time Warping ($k$-DTW), a novel dissimilarity measure for polygonal curves. $k$-DTW has stronger metric properties than Dynamic Time Warping (DTW) and is more robust to outliers than the Fréchet distance, which are the two gold standards of dissimilarity measures for polygonal curves. We show interesting properties of $k$-DTW and give an exact algorithm as well as a $(1+\varepsilon)$-approximation algorithm for $k$-DTW by a parametric search for the $k$-th largest matched distance. We prove the first dimension-free learning bounds for curves and further learning theoretic results. $k$-DTW not only admits smaller sample size than DTW for the problem of learning the median of curves, where some factors depending on the curves' complexity $m$ are replaced by $k$, but we also show a surprising separation on the associated Rademacher and Gaussian complexities: $k$-DTW admits strictly smaller bounds than DTW, by a factor $\tilde{\Omega}(\sqrt{m})$ when $k \ll m$. We complement our theoretical findings with an experimental illustration of the benefits of using $k$-DTW for clustering and nearest neighbor classification.
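For context, classic DTW—the baseline that $k$-DTW refines—is computed by a standard quadratic-time dynamic program over all monotone matchings of the two sequences. Below is a minimal sketch of that well-known recurrence for scalar sequences (the function name `dtw` and the use of absolute difference as the point cost are illustrative choices; the paper's $k$-DTW, defined via the $k$-th largest matched distance and a parametric search, is not reproduced here):

```python
import math

def dtw(a, b):
    """Classic O(n*m) dynamic program for Dynamic Time Warping.

    D[i][j] holds the minimal total matching cost between the first i
    points of a and the first j points of b; each step extends the
    matching by one point on either or both curves.
    """
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # point distance; a norm in higher dimensions
            D[i][j] = cost + min(D[i - 1][j],      # advance on curve a
                                 D[i][j - 1],      # advance on curve b
                                 D[i - 1][j - 1])  # advance on both
    return D[n][m]
```

Because DTW sums *all* matched distances, a single outlier point can dominate the value, which is one motivation the abstract gives for the $k$-DTW parameterization.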