🤖 AI Summary
This paper studies diversity-aware clustering, where data points carry multiple attributes and therefore belong to overlapping groups. The number of cluster centers chosen from each group must lie within prescribed lower and upper bounds, while the solution minimizes the $k$-median, $k$-means, or $k$-supplier objective. We first establish the problem's computational complexity: it is NP-hard, W[1]-hard with respect to natural parameters, and hard to approximate in polynomial time. We then design parameterized approximation algorithms with approximation ratios of roughly 1.736, 3.943, and 5 for the $k$-median, $k$-means, and $k$-supplier variants, respectively. Assuming Gap-ETH, the ratios for $k$-median and $k$-means are tight. Our results extend fairness-aware clustering theory from the restrictive disjoint-group setting to the more realistic overlapping-group scenario, and they imply the same approximation factors for the fair variants with disjoint groups and lower-bound requirements. To the best of our knowledge, this work provides the first solution that combines tight guarantees (for $k$-median and $k$-means) with broad model applicability for clustering under diversity constraints.
📝 Abstract
In this work, we study diversity-aware clustering problems in which the data points are associated with multiple attributes, resulting in intersecting groups. A clustering solution must ensure that the number of cluster centers chosen from each group lies within the range defined by that group's lower and upper bound thresholds, while simultaneously minimizing the clustering objective, which can be $k$-median, $k$-means, or $k$-supplier. We study the computational complexity of the proposed problems, offering insights into their NP-hardness, polynomial-time inapproximability, and fixed-parameter intractability. We present parameterized approximation algorithms with approximation ratios $1+\frac{2}{e}+\epsilon \approx 1.736$, $1+\frac{8}{e}+\epsilon \approx 3.943$, and $5$ for diversity-aware $k$-median, diversity-aware $k$-means, and diversity-aware $k$-supplier, respectively. Assuming Gap-ETH, the approximation ratios are tight for the diversity-aware $k$-median and diversity-aware $k$-means problems. Our results imply the same approximation factors for their respective fair variants with disjoint groups (fair $k$-median, fair $k$-means, and fair $k$-supplier) with lower bound requirements.
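To make the constraint concrete, the following is a minimal Python sketch of the problem statement only, not of the paper's algorithms. It assumes that every data point is both a client and a candidate center, that distances are Euclidean, and that groups are given as sets of point indices that may overlap. The names `is_feasible`, `k_median_cost`, and `brute_force` are hypothetical, and the exhaustive search is exponential in $k$, so it is usable only on toy instances.

```python
from itertools import combinations
from math import dist


def is_feasible(centers, groups, lower, upper):
    """Check that each group contributes between lower[i] and upper[i] centers.

    Groups may intersect, so one center can count toward several groups.
    """
    chosen = set(centers)
    return all(
        lo <= len(chosen & group) <= hi
        for group, lo, hi in zip(groups, lower, upper)
    )


def k_median_cost(points, centers):
    """Sum over all points of the distance to the nearest chosen center."""
    return sum(min(dist(p, points[c]) for c in centers) for p in points)


def brute_force(points, groups, lower, upper, k):
    """Try every k-subset of points as centers (illustration only).

    Returns (cost, centers) for the cheapest feasible subset, or None
    if no k-subset satisfies the group bounds.
    """
    best = None
    for centers in combinations(range(len(points)), k):
        if not is_feasible(centers, groups, lower, upper):
            continue
        cost = k_median_cost(points, centers)
        if best is None or cost < best[0]:
            best = (cost, centers)
    return best


# Toy instance: four points on a line, two intersecting groups (both contain point 2).
points = [(0, 0), (1, 0), (2, 0), (10, 0)]
groups = [{0, 1, 2}, {2, 3}]
lower = [2, 1]  # at least 2 centers from group 0, at least 1 from group 1
upper = [2, 2]  # at most 2 centers from either group
result = brute_force(points, groups, lower, upper, k=2)
```

In this toy instance the unconstrained optimum would pick points 1 and 3, but that choice takes only one center from the first group and is therefore infeasible. The bounds force point 2, the only point shared by both groups, into every feasible solution, which illustrates how intersecting groups interact with the objective.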