Near-Optimal Bounds for Parameterized Euclidean k-means

📅 2026-03-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the approximation limits of the parameterized $k$-means problem in Euclidean space, focusing on the trade-off between running time and approximation ratio. The authors introduce a new fine-grained complexity hypothesis, the Exponential Time for Expanders Hypothesis (XXH), and combine tools from computational complexity theory and graph expansion analysis to establish a nearly tight lower bound on the running time of $(1+\varepsilon)$-approximation algorithms for $k$-means. Under XXH, they prove that no $(1+\varepsilon)$-approximation algorithm can run in time $2^{(k/\varepsilon)^{1-o(1)}} \cdot n^{O(1)}$, and they further show that an existing exact algorithm is already optimal for small $k$. This result provides the first characterization of the approximability boundary of $k$-means within the fine-grained complexity framework.
📝 Abstract
The $k$-means problem is a classic objective for modeling clustering in a metric space. Given a set of points in a metric space, the goal is to find $k$ representative points so as to minimize the sum of the squared distances from each point to its closest representative. In this work, we study the approximability of $k$-means in Euclidean spaces parameterized by the number of clusters, $k$. In seminal works, de la Vega, Karpinski, Kenyon, and Rabani [STOC'03] and Kumar, Sabharwal, and Sen [JACM'10] showed how to obtain a $(1+\varepsilon)$-approximation for high-dimensional Euclidean $k$-means in time $2^{(k/\varepsilon)^{O(1)}} \cdot dn^{O(1)}$. In this work, we introduce a new fine-grained hypothesis called Exponential Time for Expanders Hypothesis (XXH) which roughly asserts that there are no non-trivial exponential time approximation algorithms for the vertex cover problem on near perfect vertex expanders. Assuming XXH, we close the above long line of work on approximating Euclidean $k$-means by showing that there is no $2^{(k/\varepsilon)^{1-o(1)}} \cdot n^{O(1)}$ time algorithm achieving a $(1+\varepsilon)$-approximation for $k$-means in Euclidean space. This lower bound is tight as it matches the algorithm given by Feldman, Monemizadeh, and Sohler [SoCG'07] whose runtime is $2^{\tilde{O}(k/\varepsilon)} + O(ndk)$. Furthermore, assuming XXH, we show that the seminal $O(n^{kd+1})$ runtime exact algorithm of Inaba, Katoh, and Imai [SoCG'94] for $k$-means is optimal for small values of $k$.
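The abstract's objective can be stated concretely: given points $P$ and centers $C$, the $k$-means cost is $\sum_{x \in P} \min_{c \in C} \|x - c\|^2$. The minimal sketch below evaluates that cost on a toy instance; it illustrates only the objective being approximated, not the algorithms or lower-bound constructions discussed in the paper, and all names and data in it are illustrative.

```python
def kmeans_cost(points, centers):
    """Sum of squared Euclidean distances from each point to its nearest center."""
    total = 0.0
    for p in points:
        # Assign p to its closest center and charge the squared distance.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centers
        )
    return total

# Toy example: two well-separated clusters in the plane, k = 2.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centers = [(0.0, 0.5), (10.0, 0.5)]
print(kmeans_cost(points, centers))  # each point is at squared distance 0.25 -> 1.0
```

A $(1+\varepsilon)$-approximation algorithm, in these terms, returns centers whose cost is at most $(1+\varepsilon)$ times the minimum of `kmeans_cost` over all choices of $k$ centers.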
Problem

Research questions and friction points this paper is trying to address.

- k-means
- Euclidean space
- parameterized complexity
- approximation algorithms
- computational hardness
Innovation

Methods, ideas, or system contributions that make the work stand out.

- Exponential Time for Expanders Hypothesis
- parameterized k-means
- near-optimal lower bounds
- Euclidean clustering
- fine-grained complexity