Recovering Small Communities in the Planted Partition Model

📅 2025-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies community recovery in the Planted Partition Model (PPM) when the number of communities is arbitrarily large and their sizes are highly heterogeneous, in particular when sizes follow a power-law distribution. In this setting, conventional accuracy- or alignment-based evaluation metrics become unsuitable. The authors therefore redefine exact, almost exact, and weak recovery in terms of the correlation coefficient between partitions. They show that Diamond Percolation, an algorithm based on common neighbors, recovers the communities under mild assumptions on the edge probabilities, with minimal restrictions on the number and sizes of communities. To the best of the authors' knowledge, these are the first recovery results for power-law-sized communities, a size distribution frequently found in real-world networks.

📝 Abstract
We analyze community recovery in the planted partition model (PPM) in regimes where the number of communities is arbitrarily large. We examine the three standard recovery regimes: exact recovery, almost exact recovery, and weak recovery. When communities vary in size, traditional accuracy- or alignment-based metrics become unsuitable for assessing the correctness of a predicted partition. To address this, we redefine these recovery regimes using the correlation coefficient, a more versatile metric for comparing partitions. We then demonstrate that Diamond Percolation, an algorithm based on common neighbors, successfully recovers communities under mild assumptions on edge probabilities, with minimal restrictions on the number and sizes of communities. As a key application, we consider the case where community sizes follow a power-law distribution, a characteristic frequently found in real-world networks. To the best of our knowledge, we provide the first recovery results for such unbalanced partitions.
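
The abstract does not spell out how the correlation coefficient between two partitions is computed. A minimal sketch, assuming it is the Pearson correlation between the binary "same community" indicators of the two partitions taken over all vertex pairs; the function name `partition_correlation` and this exact definition are assumptions for illustration, not taken from the text above.

```python
from collections import Counter
from math import comb, sqrt


def partition_correlation(labels_a, labels_b):
    """Pearson correlation between the pair-indicator vectors of two partitions.

    Assumed definition: for every unordered vertex pair, each partition gives
    a 0/1 indicator (1 if both vertices share a community). The score is the
    Pearson correlation of these two indicator vectors.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same vertices")

    total_pairs = comb(n, 2)
    # Number of intra-community pairs in each partition.
    m_a = sum(comb(size, 2) for size in Counter(labels_a).values())
    m_b = sum(comb(size, 2) for size in Counter(labels_b).values())
    # Pairs that are intra-community in both partitions at once.
    m_ab = sum(comb(size, 2) for size in Counter(zip(labels_a, labels_b)).values())

    denom = sqrt(m_a * (total_pairs - m_a) * m_b * (total_pairs - m_b))
    if denom == 0:
        # Undefined when a partition has no intra pairs or only intra pairs.
        return float("nan")
    return (total_pairs * m_ab - m_a * m_b) / denom
```

Because the score depends only on which pairs are grouped together, it is invariant to relabeling communities and does not require aligning predicted communities with true ones, which is what makes it usable when the number of communities is large and their sizes are unbalanced.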
Problem

Research questions and friction points this paper is trying to address.

Recovering communities in the PPM when the number of communities is arbitrarily large
Redefining the recovery regimes with the correlation coefficient in place of accuracy- or alignment-based metrics
Applying Diamond Percolation to community sizes that follow a power-law distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the correlation coefficient to compare partitions
Employs the Diamond Percolation algorithm, which is based on common neighbors
Applies to power-law community size distributions
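
The text above describes Diamond Percolation only as an algorithm based on common neighbors. The following is a rough sketch under an assumed rule: keep an edge only if its endpoints share at least `min_common` neighbors, then report the connected components of the retained edges as communities. The default threshold of 2, the function name `diamond_percolation`, and the input format are assumptions for illustration and may differ from the authors' definition.

```python
from itertools import combinations


def diamond_percolation(n, edges, min_common=2):
    """Label vertices 0..n-1 by the components of the 'well-supported' edges.

    Assumed rule (not confirmed by the text): an edge (u, v) is kept only if
    u and v have at least `min_common` common neighbors.
    """
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Union-find over vertices, merging along retained edges.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u in range(n):
        for v in adj[u]:
            if u < v and len(adj[u] & adj[v]) >= min_common:
                parent[find(u)] = find(v)

    return [find(v) for v in range(n)]


# Toy input: two 4-cliques joined by a single bridge edge (3, 4).
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
labels = diamond_percolation(8, edges)
```

In the toy input, every edge inside a clique has two common neighbors and is kept, while the bridge has none and is dropped, so the two cliques end up with different labels.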
Martijn Gosgens
Centrum Wiskunde & Informatica (CWI) Amsterdam
Maximilien Dreveton
EPFL
Statistical Network Analysis, Community Detection, Clustering, Random graphs