A Broader View on Clustering under Cluster-Aware Norm Objectives

📅 2025-12-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the $(f,g)$-clustering problem, a unified framework that captures classical clustering objectives including $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. The objective is defined via monotone symmetric norms $f$ and $g$. To handle it, we propose a parameterized interpolation framework whose approximation ratio transitions smoothly, up to polylogarithmic factors, between the best known bounds for the individual clustering models. Our method combines hierarchical approximation with randomized rounding. For $(f,\ell_1)$-clustering, we achieve an $O(\log^2 n)$ approximation ratio, substantially improving upon the prior $\tilde{O}(\sqrt{n})$ bound. For general $(f,g)$-clustering, we obtain an $O(k)$ approximation, which improves upon the prior $\tilde{O}(\sqrt{kn})$ bound and matches the best-known upper bound for Min-Load $k$-Clustering.

📝 Abstract

We revisit the $(f,g)$-clustering problem that we introduced in a recent work [SODA'25], and which subsumes fundamental clustering problems such as $k$-Center, $k$-Median, Min-Sum of Radii, and Min-Load $k$-Clustering. This problem assigns each of the $k$ clusters a cost determined by the monotone, symmetric norm $f$ applied to the vector of distances in the cluster, and aims at minimizing the norm $g$ applied to the vector of cluster costs. Previously, we focused on certain special cases for which we designed constant-factor approximation algorithms. Our bounds for more general settings, however, left large gaps to the known bounds for the basic problems they capture. In this work, we provide a clearer picture of the approximability of these more general settings. First, we design an $O(\log^2 n)$-approximation algorithm for $(f, L_{1})$-clustering for any $f$. This improves upon our previous $\widetilde{O}(\sqrt{n})$-approximation. Second, we provide an $O(k)$-approximation for the general $(f,g)$-clustering problem, which improves upon our previous $\widetilde{O}(\sqrt{kn})$-approximation algorithm and matches the best-known upper bound for Min-Load $k$-Clustering. We then design an approximation algorithm for $(f,g)$-clustering that interpolates, up to polylog factors, between the best known bounds for $k$-Center, $k$-Median, Min-Sum of Radii, Min-Load $k$-Clustering, (Top, $L_{1}$)-clustering, and $(L_{\infty},g)$-clustering based on a newly defined parameter of $f$ and $g$.
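To make the objective concrete, the following is a minimal sketch of how the $(f,g)$ cost of a fixed clustering could be evaluated. It only evaluates the objective and is not any of the paper's approximation algorithms. The names `fg_cost`, `l1`, and `linf`, and the five-point example on a line, are assumptions introduced here for illustration. The four choices of $(f,g)$ shown are the standard ways the classical problems arise as special cases.

```python
import numpy as np

def l1(v):
    """L_1 norm of a nonnegative vector (sum of entries)."""
    return float(np.sum(v))

def linf(v):
    """L_infinity norm of a nonnegative vector (largest entry); 0 for an empty cluster."""
    return float(np.max(v)) if v.size else 0.0

def fg_cost(dist, centers, assign, f, g):
    """Hypothetical helper: (f,g) objective of a given clustering.

    dist    : n x n matrix of pairwise distances
    centers : point indices chosen as the k centers
    assign  : assign[i] is the position in `centers` serving point i
    f       : norm applied to the distance vector of each cluster
    g       : norm applied to the vector of the k cluster costs
    """
    assign = np.asarray(assign)
    cluster_costs = [f(dist[assign == j, c]) for j, c in enumerate(centers)]
    return g(np.array(cluster_costs))

# Illustrative instance: five points on a line, two clusters.
positions = np.array([0, 1, 4, 10, 12])
dist = np.abs(positions[:, None] - positions[None, :])
centers = [1, 3]            # the points at positions 1 and 10
assign = [0, 0, 0, 1, 1]    # first three points -> center 0, last two -> center 1

k_center     = fg_cost(dist, centers, assign, linf, linf)  # max of radii
k_median     = fg_cost(dist, centers, assign, l1, l1)      # sum of all distances
min_sum_rad  = fg_cost(dist, centers, assign, linf, l1)    # sum of radii
min_load     = fg_cost(dist, centers, assign, l1, linf)    # largest cluster load
```

In this instance the cluster distance vectors are $(1, 0, 3)$ and $(0, 2)$, so the four objectives evaluate to 3, 6, 5, and 4 respectively. Swapping in other monotone symmetric norms for `f` and `g` gives other members of the family.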

Problem

Research questions and friction points this paper is trying to address.

Improves approximation bounds for general (f,g)-clustering problems.

Develops algorithms for clustering under cluster-aware norm objectives.

Unifies analysis of fundamental clustering variants like k-Center and k-Median.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops an O(log^2 n)-approximation for (f, L1)-clustering for any f.

Provides an O(k)-approximation for the general (f,g)-clustering problem.

Designs an algorithm that interpolates, up to polylog factors, between the best known bounds via a newly defined parameter of f and g.

🔎 Similar Papers

Normalised Clustering Accuracy: An Asymmetric External Cluster Validity Measure