Deterministic $k$-Median Clustering in Near-Optimal Time

📅 2025-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
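Addresses the metric k-median clustering problem with a deterministic algorithm that achieves a logarithmic approximation ratio in near-optimal time, narrowing the performance gap between randomized and deterministic algorithms.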

📝 Abstract
The metric $k$-median problem is a textbook clustering problem. As input, we are given a metric space $V$ of size $n$ and an integer $k$, and our task is to find a subset $S \subseteq V$ of at most $k$ 'centers' that minimizes the total distance from each point in $V$ to its nearest center in $S$. Mettu and Plaxton [UAI'02] gave a randomized algorithm for $k$-median that computes a $O(1)$-approximation in $\tilde O(nk)$ time. They also showed that any algorithm for this problem with a bounded approximation ratio must have a running time of $\Omega(nk)$. Thus, the running time of their algorithm is optimal up to polylogarithmic factors. For deterministic $k$-median, Guha et al. [FOCS'00] gave an algorithm that computes a $\text{poly}(\log (n/k))$-approximation in $\tilde O(nk)$ time, where the degree of the polynomial in the approximation is unspecified. To the best of our knowledge, this remains the state-of-the-art approximation of any deterministic $k$-median algorithm with this running time. This leads us to the following natural question: What is the best approximation of a deterministic $k$-median algorithm with near-optimal running time? We make progress in answering this question by giving a deterministic algorithm that computes a $O(\log(n/k))$-approximation in $\tilde O(nk)$ time. We also provide a lower bound showing that any deterministic algorithm with this running time must have an approximation ratio of $\Omega(\log n/(\log k + \log \log n))$, establishing a gap between the randomized and deterministic settings for $k$-median.
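To make the objective in the abstract concrete, here is a minimal Python sketch of the $k$-median cost function together with an exhaustive search over center sets. It is not the paper's algorithm: the brute-force search takes exponential time and is only meant to illustrate the definition on a tiny instance. The function names `kmedian_cost` and `brute_force_kmedian` and the example points are assumptions made for illustration.

```python
from itertools import combinations

def kmedian_cost(dist, centers):
    """Total distance from each point to its nearest center in `centers`."""
    return sum(min(dist[p][c] for c in centers) for p in range(len(dist)))

def brute_force_kmedian(dist, k):
    """Exhaustive search over all center sets of size exactly k.

    Illustration only (exponential time). Searching sets of size exactly k
    is enough for k <= n, since adding a center never increases the cost.
    """
    n = len(dist)
    best = min(combinations(range(n), k), key=lambda S: kmedian_cost(dist, S))
    return set(best), kmedian_cost(dist, best)

# Hypothetical instance: five points on a line, with |a - b| as the metric.
positions = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in positions] for a in positions]

centers, cost = brute_force_kmedian(dist, 2)
print(centers, cost)
```

On this instance the optimal cost is 3: one center sits in the middle of the left cluster (position 1) and the other at either point of the right cluster.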
Problem

Research questions and friction points this paper is trying to address.

Designing a deterministic k-median algorithm that runs in near-optimal time
Improving the approximation ratio achievable by deterministic k-median algorithms
Establishing a gap between randomized and deterministic k-median performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deterministic algorithm for k-median clustering
O(log(n/k))-approximation in near-optimal Õ(nk) time
Lower bound of Ω(log n/(log k + log log n)) on the approximation ratio of any deterministic algorithm with this running time
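
As a rough numerical illustration of how the upper bound O(log(n/k)) and the lower bound Ω(log n/(log k + log log n)) from the abstract compare, the sketch below evaluates the two expressions for one choice of n and k. It ignores the constants hidden by the asymptotic notation and assumes base-2 logarithms, so the numbers only indicate growth rates, not actual approximation ratios. The function names are hypothetical.

```python
import math

def upper_bound_shape(n, k):
    """log(n/k): growth rate of the algorithm's approximation guarantee."""
    return math.log2(n / k)

def lower_bound_shape(n, k):
    """log n / (log k + log log n): growth rate of the lower bound."""
    return math.log2(n) / (math.log2(k) + math.log2(math.log2(n)))

# Assumed example values; constants hidden by O and Omega are ignored.
n, k = 2**20, 16
upper = upper_bound_shape(n, k)
lower = lower_bound_shape(n, k)
print(f"n={n}, k={k}: upper ~ {upper:.2f}, lower ~ {lower:.2f}")
```

For these values the two expressions differ noticeably, which reflects the remaining distance between the algorithm's guarantee and the lower bound.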