🤖 AI Summary
This paper studies the $(k,\ell)$-median clustering problem for time series under the discrete Fréchet distance, where sequences have variable length $m$ and $k$ may be as large as $\Omega(n)$. To overcome the scalability bottleneck of existing methods—whose runtime depends polynomially on input size $n$—we propose the first near-linear-time $(1+\varepsilon)$-approximation algorithm, for constant $\ell$ and $\varepsilon$. Our method introduces a novel dimensionality reduction technique for the Fréchet distance, enabling the construction of a coreset whose size is independent of both $n$ and $m$. This coreset is then integrated with the clustering framework of Cohen-Addad et al., allowing efficient clustering in the reduced space. Experiments demonstrate that our approach significantly improves both efficiency and scalability for large-scale time-series clustering under the Fréchet metric.
📝 Abstract
A time series of complexity $m$ is a sequence of $m$ real-valued measurements. The discrete Fréchet distance $d_{dF}(x,y)$ is a distance measure between two time series $x$ and $y$ of possibly different complexity. Given a set $P$ of $n$ time series represented as $m$-dimensional vectors over the reals, the $(k,\ell)$-median problem under discrete Fréchet distance aims to find a set $C$ of $k$ time series of complexity $\ell$ such that $$\sum_{x\in P} \min_{c\in C} d_{dF}(x,c)$$ is minimized. In this paper, we give the first near-linear time $(1+\varepsilon)$-approximation algorithm for this problem when $\ell$ and $\varepsilon$ are constants but $k$ can be as large as $\Omega(n)$. We obtain our result by introducing a new dimension reduction technique for discrete Fréchet distance and then adapting an algorithm of Cohen-Addad et al. (J. ACM 2021) to work on the dimension-reduced input. As a byproduct we also improve the best coreset construction for $(k,\ell)$-median under discrete Fréchet distance (Cohen-Addad et al., SODA 2025) and show that its size can be independent of the number of input time series *and* their complexity.
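To make the objective concrete: the discrete Fréchet distance between two one-dimensional time series can be computed with the standard dynamic program of Eiter and Mannila, and the $(k,\ell)$-median cost is then a sum of nearest-center distances. The sketch below illustrates these two definitions only; it is not the paper's near-linear-time algorithm, and the function names are my own.

```python
def discrete_frechet(x, y):
    """Discrete Fréchet distance between 1-D time series x and y
    via the classic O(len(x) * len(y)) dynamic program."""
    m, n = len(x), len(y)
    dp = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            cost = abs(x[i] - y[j])
            if i == 0 and j == 0:
                dp[i][j] = cost
            elif i == 0:
                dp[i][j] = max(cost, dp[i][j - 1])
            elif j == 0:
                dp[i][j] = max(cost, dp[i - 1][j])
            else:
                # A traversal may advance in x, in y, or in both;
                # take the cheapest predecessor, then the bottleneck max.
                dp[i][j] = max(cost, min(dp[i - 1][j],
                                         dp[i][j - 1],
                                         dp[i - 1][j - 1]))
    return dp[m - 1][n - 1]


def kl_median_cost(P, C):
    """(k, l)-median objective: each input series is charged
    its discrete Fréchet distance to the nearest center in C."""
    return sum(min(discrete_frechet(x, c) for c in C) for x in P)
```

For example, `discrete_frechet([0, 1, 2], [0, 2])` equals `1.0`: the optimal traversal matches the middle sample `1` to one endpoint of the shorter series. Note that centers in `C` may have a different (typically much smaller) complexity $\ell$ than the input series, which is exactly what the $(k,\ell)$ restriction captures.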