🤖 AI Summary
This work addresses the prohibitively high running time of existing $(1+\varepsilon)$-approximation algorithms for $k$-median and $k$-means clustering in low-dimensional Euclidean space. By introducing a refined geometric decomposition scheme combined with dynamic programming, the authors significantly reduce the exponential dependence in the runtime from $2^{(1/\varepsilon)^{O(d^2)}}$ to $2^{\widetilde{O}((1/\varepsilon)^{d-1})}$. Moreover, under the Gap Exponential Time Hypothesis (Gap ETH), they establish a matching conditional lower bound, showing that no $(1+\varepsilon)$-approximation algorithm can run in time $2^{o((1/\varepsilon)^{d-1})} \cdot n^{O(1)}$. This result yields the first near-linear-time $(1+\varepsilon)$-approximation algorithms for these problems and provides nearly tight complexity bounds.
📝 Abstract
The $k$-median and $k$-means objectives are classic ways to model clustering in a metric space. Given a set of points in a metric space, the goal of the $k$-median (resp. $k$-means) problem is to find $k$ representative points so as to minimize the sum of the distances (resp. sum of squared distances) from each point to its closest representative. Cohen-Addad, Feldmann, and Saulpic [JACM'21] showed how to obtain a $(1+\varepsilon)$-approximation in low-dimensional Euclidean metrics for both the $k$-median and $k$-means problems in near-linear time $2^{(1/\varepsilon)^{O(d^2)}} \cdot n \cdot \text{polylog}(n)$ (where $d$ is the dimension and $n$ is the number of input points).
We improve this running time to $2^{\widetilde{O}((1/\varepsilon)^{d-1})} \cdot n \cdot \text{polylog}(n)$, and show an almost matching lower bound: under the Gap Exponential Time Hypothesis for 3-SAT, there is no $2^{o((1/\varepsilon)^{d-1})} \cdot n^{O(1)}$-time algorithm achieving a $(1+\varepsilon)$-approximation for $k$-means.
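The two objectives above differ only in whether distances are summed directly or squared. A minimal sketch of evaluating both costs for a given set of centers (the function names and the brute-force nearest-center evaluation are illustrative, not the paper's algorithm):

```python
import math

def _dist(p, q):
    """Euclidean distance between two d-dimensional points (tuples)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def k_median_cost(points, centers):
    # k-median objective: sum of distances to each point's closest center.
    return sum(min(_dist(p, c) for c in centers) for p in points)

def k_means_cost(points, centers):
    # k-means objective: sum of *squared* distances to the closest center.
    return sum(min(_dist(p, c) ** 2 for c in centers) for p in points)

# Tiny example on the line: k = 2 centers for three points.
points = [(0.0,), (1.0,), (10.0,)]
centers = [(0.0,), (10.0,)]
print(k_median_cost(points, centers))  # only (1.0,) pays, at distance 1
print(k_means_cost(points, centers))   # squared distance is also 1
```

An approximation algorithm for either problem searches over choices of `centers` to get within a $(1+\varepsilon)$ factor of the minimum possible cost.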