🤖 AI Summary
This paper studies the $(f,g)$-clustering problem, a unified framework for modeling classical clustering objectives, including $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. To handle its generalized objective, defined via monotone symmetric norms $f$ and $g$, we propose a parameterized algorithmic framework that interpolates, up to polylog factors, between the best known approximation ratios for these clustering models. Our method combines hierarchical approximation with randomized rounding. For $(f,\ell_1)$-clustering, we achieve an $O(\log^2 n)$ approximation ratio, substantially improving upon the prior $\widetilde{O}(\sqrt{n})$ bound. For general $(f,g)$-clustering, we obtain an $O(k)$ approximation, improving upon the prior $\widetilde{O}(\sqrt{kn})$ bound and matching the best-known upper bound for Min-Load $k$-Clustering.
📝 Abstract
We revisit the $(f,g)$-clustering problem that we introduced in a recent work [SODA'25], and which subsumes fundamental clustering problems such as $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. This problem assigns each of the $k$ clusters a cost determined by the monotone, symmetric norm $f$ applied to the vector of distances in the cluster, and aims at minimizing the norm $g$ applied to the vector of cluster costs. Previously, we focused on certain special cases for which we designed constant-factor approximation algorithms. Our bounds for more general settings, however, left large gaps to the known bounds for the basic problems they capture.
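As an illustration of the objective just described, the following minimal Python sketch evaluates the $(f,g)$ cost of a fixed clustering. The names `fg_cost`, `l1`, and `linf` are hypothetical helpers introduced here, not taken from the paper, and the sketch assumes each point's distance to its cluster center is already given. It only evaluates the objective; it is not an approximation algorithm.

```python
from typing import Callable, List, Sequence

# A norm is modeled as a function from a vector of reals to a real.
# (Hypothetical type alias for this sketch.)
Norm = Callable[[Sequence[float]], float]


def l1(v: Sequence[float]) -> float:
    """L_1 norm: sum of absolute values."""
    return float(sum(abs(x) for x in v))


def linf(v: Sequence[float]) -> float:
    """L_infinity norm: largest absolute value (0 for an empty vector)."""
    return float(max((abs(x) for x in v), default=0.0))


def fg_cost(dist_to_center: Sequence[float],
            assignment: Sequence[int],
            k: int,
            f: Norm,
            g: Norm) -> float:
    """Evaluate g applied to the vector of per-cluster costs,
    where each cluster cost is f applied to that cluster's distances.

    Assumption: dist_to_center[i] is the distance from point i to the
    center of cluster assignment[i], with clusters indexed 0..k-1.
    """
    clusters: List[List[float]] = [[] for _ in range(k)]
    for d, c in zip(dist_to_center, assignment):
        clusters[c].append(d)
    cluster_costs = [f(dists) for dists in clusters]
    return g(cluster_costs)


# Four points split into two clusters.
dists = [1.0, 2.0, 3.0, 4.0]
assign = [0, 0, 1, 1]

k_center = fg_cost(dists, assign, 2, linf, linf)   # max over all distances
k_median = fg_cost(dists, assign, 2, l1, l1)       # sum over all distances
min_sum_radii = fg_cost(dists, assign, 2, linf, l1)  # sum of cluster radii
min_load = fg_cost(dists, assign, 2, l1, linf)     # largest cluster load
```

Under these choices of $f$ and $g$, the four calls correspond to the $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering objectives, respectively, which is the sense in which the general problem subsumes them.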
In this work, we provide a clearer picture of the approximability of these more general settings. First, we design an $O(\log^2 n)$-approximation algorithm for $(f, L_{1})$-clustering for any $f$. This improves upon our previous $\widetilde{O}(\sqrt{n})$-approximation. Second, we provide an $O(k)$-approximation for the general $(f,g)$-clustering problem, which improves upon our previous $\widetilde{O}(\sqrt{kn})$-approximation algorithm and matches the best-known upper bound for Min-Load $k$-Clustering.
We then design an approximation algorithm for $(f,g)$-clustering that interpolates, up to polylog factors, between the best known bounds for $k$-Center, $k$-Median, Min-Sum of Radii, Min-Load $k$-Clustering, (Top, $L_{1}$)-clustering, and $(L_{\infty},g)$-clustering based on a newly defined parameter of $f$ and $g$.