A Combinatorial Algorithm for Weighted Correlation Clustering

📅 2023-10-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the weighted correlation clustering problem: given a vertex set and, for each pair of vertices, two weights representing their difference (disagreement) and similarity (agreement), the goal is to partition the vertices into clusters so as to minimize the total weight of intra-cluster differences plus inter-cluster similarities. As the problem is NP-hard, the paper proposes a simple randomized combinatorial approximation algorithm that runs in O(n²) time, where n is the number of vertices. Its approximation factor is 3 when the instance satisfies the probability constraints, and 1.6 when the instance additionally satisfies the triangle inequality. According to the abstract, both results improve on the best known running time, and the second also improves on the best known approximation factor.
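The objective described above can be written compactly. The notation here is an assumption for illustration and is not taken from the paper: $w^-_{uv}$ is the difference weight of the pair $\{u,v\}$, $w^+_{uv}$ is its similarity weight, and $\mathcal{C}(u)$ denotes the cluster that contains $u$.

```latex
\[
\operatorname{cost}(\mathcal{C})
  \;=\; \sum_{\substack{u < v \\ \mathcal{C}(u) = \mathcal{C}(v)}} w^-_{uv}
  \;+\; \sum_{\substack{u < v \\ \mathcal{C}(u) \neq \mathcal{C}(v)}} w^+_{uv}
\]
```

Each unordered pair contributes exactly one of its two weights: the difference weight if the pair shares a cluster, the similarity weight otherwise.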

📝 Abstract

This article introduces a quick and simple combinatorial approximation algorithm for the weighted correlation clustering problem. In this problem, we have a set of vertices and two weight values for each pair of vertices denoting their difference and similarity. The goal is to cluster the vertices with minimum total intra-cluster difference weights plus inter-cluster similarity weights. Our algorithm is a randomized approximation algorithm with $O(n^2)$ running time, where $n$ is the number of vertices. Its approximation factor is 3 when the instance satisfies the probability constraints. If the instance satisfies the triangle inequality in addition to the probability constraints, the approximation factor is 1.6. Both results are superior to the best known results in terms of running time, and the second one is also superior in terms of the approximation factor.
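To make the objective and the two instance conditions concrete, here is a minimal Python sketch. The abstract does not define the conditions, so the code assumes the definitions common in the correlation clustering literature: the probability constraints require the two weights of every pair to sum to 1, and the triangle inequality is imposed on the difference weights. The function names and the matrix representation (`w_diff`, `w_sim` as symmetric lists of lists) are hypothetical.

```python
from itertools import combinations


def clustering_cost(labels, w_diff, w_sim):
    """Intra-cluster difference weight plus inter-cluster similarity weight."""
    cost = 0.0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            cost += w_diff[i][j]  # pair kept together: pay its difference weight
        else:
            cost += w_sim[i][j]   # pair separated: pay its similarity weight
    return cost


def satisfies_probability_constraints(w_diff, w_sim, tol=1e-9):
    """Assumed definition: w_diff[i][j] + w_sim[i][j] == 1 for every pair."""
    n = len(w_diff)
    return all(abs(w_diff[i][j] + w_sim[i][j] - 1.0) <= tol
               for i, j in combinations(range(n), 2))


def satisfies_triangle_inequality(w_diff, tol=1e-9):
    """Assumed definition: w_diff[i][k] <= w_diff[i][j] + w_diff[j][k]."""
    n = len(w_diff)
    return all(w_diff[i][k] <= w_diff[i][j] + w_diff[j][k] + tol
               for i in range(n) for j in range(n) for k in range(n)
               if len({i, j, k}) == 3)


# Illustrative 3-vertex instance (made-up numbers, not from the paper).
w_diff = [[0.0, 0.2, 0.9],
          [0.2, 0.0, 0.8],
          [0.9, 0.8, 0.0]]
w_sim = [[0.0 if i == j else 1.0 - w_diff[i][j] for j in range(3)]
         for i in range(3)]

print(clustering_cost([0, 0, 1], w_diff, w_sim))
print(satisfies_probability_constraints(w_diff, w_sim))
print(satisfies_triangle_inequality(w_diff))
```

On this toy instance, grouping vertices 0 and 1 and isolating vertex 2 costs about 0.5, compared with 1.9 for a single cluster and 1.1 for three singletons.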

Problem

Research questions and friction points this paper is trying to address.

Develops a fast algorithm for weighted correlation clustering

Minimizes intra-cluster differences and inter-cluster similarities

Improves runtime and approximation factors over existing methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combinatorial approximation algorithm for clustering

Randomized algorithm with O(n^2) runtime

Improved approximation factors under constraints
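This page does not describe the paper's algorithm itself. To illustrate what a randomized combinatorial scheme with $O(n^2)$ running time can look like, the sketch below implements the classic pivot approach from the correlation clustering literature, which is a different, well-known technique and not the method proposed here. The function name, the rounding rule (join the pivot when similarity is at least the difference), and the matrix inputs are assumptions made for illustration.

```python
import random


def pivot_clustering(w_diff, w_sim, seed=None):
    """Classic pivot scheme (NOT the paper's algorithm): repeatedly pick a
    random pivot and cluster it with every remaining vertex whose similarity
    to the pivot is at least its difference."""
    rng = random.Random(seed)
    n = len(w_diff)
    labels = [-1] * n
    remaining = list(range(n))
    next_label = 0
    while remaining:
        pivot = rng.choice(remaining)
        labels[pivot] = next_label
        leftover = []
        for v in remaining:
            if v == pivot:
                continue
            if w_sim[pivot][v] >= w_diff[pivot][v]:
                labels[v] = next_label  # v joins the pivot's cluster
            else:
                leftover.append(v)      # v waits for a later pivot
        remaining = leftover
        next_label += 1
    return labels


# Illustrative 3-vertex instance (made-up numbers, not from the paper).
w_diff = [[0.0, 0.2, 0.9],
          [0.2, 0.0, 0.8],
          [0.9, 0.8, 0.0]]
w_sim = [[0.0 if i == j else 1.0 - w_diff[i][j] for j in range(3)]
         for i in range(3)]

print(pivot_clustering(w_diff, w_sim, seed=0))
```

Each round scans the remaining vertices once and removes at least the pivot, so there are at most $n$ rounds of $O(n)$ work, which gives the quadratic bound. Only pivot-to-vertex weights are ever read, which is what makes this style of algorithm purely combinatorial: no linear program is solved.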

🔎 Similar Papers

Information-Theoretic Active Correlation Clustering