🤖 AI Summary
This paper studies the min-max correlation clustering problem on complete graphs with ± edge labels: partition vertices into clusters to minimize the ℓ∞-norm of each vertex’s disagreement vector—the number of positive/negative edge violations with respect to each cluster. We present the first (3+ε)-approximation algorithm, breaking the prior 4-approximation barrier. Leveraging a structural insight—local similarity among optimal neighborhoods—we design a purely combinatorial algorithm enabling neighborhood-query-driven linear-time computation. Furthermore, we initiate the study of this problem in modern large-scale computational models: in the Massively Parallel Computation (MPC) model, our algorithm uses sublinear memory per machine and terminates in O(1) rounds; in the semi-streaming model, it achieves Õ(|V|) space complexity. The overall time complexity is Õ(|E⁺|), significantly improving upon previous methods.
📝 Abstract
We present an efficient algorithm for the min-max correlation clustering problem. The input is a complete graph where edges are labeled as either positive $(+)$ or negative $(-)$, and the objective is to find a clustering that minimizes the $ell_{infty}$-norm of the disagreement vector over all vertices. We resolve this problem with an efficient $(3 + epsilon)$-approximation algorithm that runs in nearly linear time, $ ilde{O}(|E^+|)$, where $|E^+|$ denotes the number of positive edges. This improves upon the previous best-known approximation guarantee of 4 by Heidrich, Irmai, and Andres, whose algorithm runs in $O(|V|^2 + |V| D^2)$ time, where $|V|$ is the number of nodes and $D$ is the maximum degree in the graph. Furthermore, we extend our algorithm to the massively parallel computation (MPC) model and the semi-streaming model. In the MPC model, our algorithm runs on machines with memory sublinear in the number of nodes and takes $O(1)$ rounds. In the streaming model, our algorithm requires only $ ilde{O}(|V|)$ space, where $|V|$ is the number of vertices in the graph. Our algorithms are purely combinatorial. They are based on a novel structural observation about the optimal min-max instance, which enables the construction of a $(3 + epsilon)$-approximation algorithm using $O(|E^+|)$ neighborhood similarity queries. By leveraging random projection, we further show these queries can be computed in nearly linear time.