Almost-Optimal Upper and Lower Bounds for Clustering in Low Dimensional Euclidean Spaces

📅 2026-03-10
🤖 AI Summary
This work addresses the high running time of existing $(1+\varepsilon)$-approximation algorithms for $k$-median and $k$-means clustering in low-dimensional Euclidean space. The previous best near-linear-time algorithm, due to Cohen-Addad, Feldmann, and Saulpic [JACM'21], runs in time $2^{(1/\varepsilon)^{O(d^2)}} \cdot n \cdot \text{polylog}(n)$, where $d$ is the dimension and $n$ is the number of input points. Using a refined geometric decomposition scheme combined with dynamic programming, the authors reduce this to $2^{\widetilde{O}((1/\varepsilon)^{d-1})} \cdot n \cdot \text{polylog}(n)$. They also establish an almost matching conditional lower bound: under the Gap Exponential Time Hypothesis (Gap ETH) for 3-SAT, no algorithm running in time $2^{o((1/\varepsilon)^{d-1})} \cdot n^{O(1)}$ achieves a $(1+\varepsilon)$-approximation for $k$-means. Together, these results give nearly tight bounds on the dependence on $\varepsilon$ in the exponent of the running time.

📝 Abstract
The $k$-median and $k$-means clustering objectives are classic objectives for modeling clustering in a metric space. Given a set of points in a metric space, the goal of the $k$-median (resp. $k$-means) problem is to find $k$ representative points so as to minimize the sum of the distances (resp. sum of squared distances) from each point to its closest representative. Cohen-Addad, Feldmann, and Saulpic [JACM'21] showed how to obtain a $(1+\varepsilon)$-factor approximation in low-dimensional Euclidean metrics for both the $k$-median and $k$-means problems in near-linear time $2^{(1/\varepsilon)^{O(d^2)}} n \cdot \text{polylog}(n)$ (where $d$ is the dimension and $n$ is the number of input points). We improve this running time to $2^{\tilde{O}(1/\varepsilon)^{d-1}} \cdot n \cdot \text{polylog}(n)$, and show an almost matching lower bound: under the Gap Exponential Time Hypothesis for 3-SAT, there is no $2^{o(1/\varepsilon^{d-1})} n^{O(1)}$ algorithm achieving a $(1+\varepsilon)$-approximation for $k$-means.
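
To make the two objectives concrete, the following minimal Python sketch evaluates the cost of a given set of representatives. It only illustrates the objective functions defined above, not the approximation algorithm of the paper; the function name `clustering_cost`, its `power` parameter, and the sample points are assumptions introduced here for illustration.

```python
import math

def clustering_cost(points, centers, power=1):
    """Sum, over all points, of (distance to the closest center) ** power.

    power=1 gives the k-median objective, power=2 the k-means objective.
    Hypothetical helper for illustration; not taken from the paper.
    """
    return sum(
        min(math.dist(p, c) for c in centers) ** power
        for p in points
    )

# Three points in the plane (d = 2) and k = 2 candidate representatives.
points = [(0.0, 0.0), (0.0, 4.0), (10.0, 0.0)]
centers = [(0.0, 2.0), (10.0, 0.0)]

k_median_cost = clustering_cost(points, centers, power=1)  # 2 + 2 + 0
k_means_cost = clustering_cost(points, centers, power=2)   # 4 + 4 + 0
print(k_median_cost, k_means_cost)
```

A $(1+\varepsilon)$-approximation algorithm returns $k$ representatives whose cost is at most $(1+\varepsilon)$ times the minimum of this quantity over all choices of $k$ representatives.
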
Problem

Research questions and friction points this paper is trying to address.

k-median
k-means
clustering
low-dimensional Euclidean space
approximation algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

k-means clustering
low-dimensional Euclidean space
approximation algorithms
computational complexity
Exponential Time Hypothesis