Connected k-Median with Disjoint and Non-disjoint Clusters

📅 2025-07-03
🤖 AI Summary
This paper studies the connectivity-constrained $k$-median problem: given a point set in a metric space and an unweighted, connected graph $G$ on the same vertex set, partition the points into $k$ clusters such that each cluster induces a connected subgraph in $G$, while minimizing the $k$-median objective. We study a model that allows clusters to overlap and design an $O(k^2 \log n)$-approximation algorithm for it. We prove that the non-overlapping variant is $\Omega(n^{1-\varepsilon})$-hard to approximate on general graphs, assuming $\text{P} \neq \text{NP}$. For tree-structured connectivity graphs, we provide a polynomial-time exact algorithm. The contributions cover both the overlapping and non-overlapping settings and combine graph connectivity analysis, decomposition-and-rounding techniques, and combinatorial optimization.

📝 Abstract
The connected $k$-median problem is a constrained clustering problem that combines distance-based $k$-clustering with connectivity information. The input consists of a metric space and an unweighted undirected connectivity graph $G$ that is completely unrelated to the metric space. The goal is to compute $k$ centers and corresponding clusters such that each cluster forms a connected subgraph of $G$ and the $k$-median cost is minimized. The problem has applications in very different fields such as geodesy (particularly districting), social network analysis (especially community detection), and bioinformatics. We study a version with overlapping clusters, where points can be part of multiple clusters, which is natural for the use case of community detection. This problem variant is $\Omega(\log n)$-hard to approximate, and our main result is an $\mathcal{O}(k^2 \log n)$-approximation algorithm for the problem. We complement it with an $\Omega(n^{1-\varepsilon})$-hardness result for the case of disjoint clusters with general connectivity graphs, as well as an exact algorithm in this setting if the connectivity graph is a tree.
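To make the problem statement concrete, the following minimal sketch checks whether a candidate solution to the disjoint variant is feasible and, if so, evaluates its $k$-median cost. It is not the paper's algorithm: it only evaluates a given solution. The names `is_connected`, `connected_k_median_cost`, `adj`, `dist`, and the toy instance are hypothetical and chosen for illustration; the sketch assumes that the clusters must partition the vertex set and that each point pays its distance to the center of its own cluster.

```python
from collections import deque


def is_connected(cluster, adj):
    """Return True if `cluster` induces a connected subgraph of the graph `adj`."""
    cluster = set(cluster)
    if not cluster:
        return False
    start = next(iter(cluster))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Only walk along edges that stay inside the cluster.
            if v in cluster and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == cluster


def connected_k_median_cost(clusters, dist, adj):
    """Evaluate a candidate solution for the disjoint variant.

    clusters: dict mapping each center to the points of its cluster.
    dist:     function (a, b) -> distance in the metric space.
    adj:      adjacency lists of the connectivity graph G.
    Returns the k-median cost, or None if the solution is infeasible.
    """
    covered = set()
    total = 0.0
    for center, members in clusters.items():
        members = set(members)
        if covered & members:               # clusters overlap
            return None
        if not is_connected(members, adj):  # connectivity constraint violated
            return None
        covered |= members
        total += sum(dist(center, p) for p in members)
    if covered != set(adj):                 # some point is unassigned
        return None
    return total


# Toy instance (assumed for illustration): four points on a line,
# connectivity graph is the path 0 - 1 - 2 - 3.
positions = {0: 0.0, 1: 1.0, 2: 5.0, 3: 6.0}
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def dist(a, b):
    return abs(positions[a] - positions[b])


feasible = connected_k_median_cost({0: [0, 1], 2: [2, 3]}, dist, adj)
infeasible = connected_k_median_cost({0: [0, 2], 1: [1, 3]}, dist, adj)
```

In this toy instance the first solution is feasible with cost 2.0, while the second is rejected because the cluster {0, 2} is not connected in the path graph, even though the metric itself would allow any grouping.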
Problem

Research questions and friction points this paper is trying to address.

Combines k-median clustering with graph connectivity constraints
Addresses overlapping clusters for community detection applications
Provides approximation algorithms for NP-hard connected k-median variants
Innovation

Methods, ideas, or system contributions that make the work stand out.

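An $\mathcal{O}(k^2 \log n)$-approximation algorithm for the variant with overlapping clusters
An $\Omega(n^{1-\varepsilon})$-hardness result for disjoint clusters on general connectivity graphs
An exact algorithm for disjoint clusters when the connectivity graph is a tree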
Jan Eube
University of Bonn, Germany
Kelin Luo
University at Buffalo, United States
Dorian Reineccius
Deutsches Zentrum für Luft- und Raumfahrt, Germany
Heiko Röglin
Professor of Computer Science, University of Bonn
Theoretical Computer Science, Smoothed Analysis, Algorithmic Game Theory
Melanie Schmidt
Heinrich-Heine-Universität Düsseldorf
Approximation Algorithms, Clustering, Geometric Data Analysis