Stochastic block models with many communities and the Kesten--Stigum bound

📅 2025-03-04
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper studies community recovery in the stochastic block model (SBM) when the number of communities $q$ grows with the number of vertices $n$ (i.e., $q \to \infty$), departing from the classical fixed-$q$ regime. Methodologically, it combines low-degree polynomial hardness arguments with algorithms based on non-backtracking walks to characterize both computational and information-theoretic thresholds. Key contributions include: (i) showing that efficient recovery remains possible above the Kesten–Stigum (KS) bound, and that recovery is low-degree hard below it when $q \ll \sqrt{n}$; (ii) giving an efficient algorithm based on non-backtracking walks that recovers communities even below the KS bound when $q \gg \sqrt{n}$; and (iii) showing that detection is easy and identifying, up to a constant, the information-theoretic threshold for community recovery as $q$ diverges. The paper also identifies a new threshold, asks whether it is the threshold for efficient recovery when $q \gg \sqrt{n}$, and derives consequences for graphon estimation that improve results of Luo and Gao (2024).

๐Ÿ“ Abstract
We study the inference of communities in stochastic block models with a growing number of communities. For block models with $n$ vertices and a fixed number of communities $q$, it was predicted in Decelle et al. (2011) that there are computationally efficient algorithms for recovering the communities above the Kesten--Stigum (KS) bound and that efficient recovery is impossible below the KS bound. This conjecture has since stimulated a lot of interest, with the achievability side proven in a line of research that culminated in the work of Abbe and Sandon (2018). Conversely, recent work by Sohn and Wein (2025) provides evidence for the hardness part using the low-degree paradigm. In this paper we investigate community recovery in the regime $q=q_n \to \infty$ as $n \to \infty$ where no such predictions exist. We show that efficient inference of communities remains possible above the KS bound. Furthermore, we show that recovery of block models is low-degree hard below the KS bound when the number of communities satisfies $q \ll \sqrt{n}$. Perhaps surprisingly, we find that when $q \gg \sqrt{n}$, there is an efficient algorithm based on non-backtracking walks for recovery even below the KS bound. We identify a new threshold and ask if it is the threshold for efficient recovery in this regime. Finally, we show that detection is easy and identify (up to a constant) the information-theoretic threshold for community recovery as the number of communities $q$ diverges. Our low-degree hardness results also naturally have consequences for graphon estimation, improving results of Luo and Gao (2024).
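
For orientation, the KS bound can be written out in the standard symmetric parametrization of the block model. The abstract does not state its parametrization, so the symbols below are assumptions: $a/n$ and $b/n$ are the within-community and between-community edge probabilities, $d$ is the average degree, and $\lambda$ is the second eigenvalue of the normalized community connectivity matrix.

```latex
\[
  d = \frac{a + (q-1)\,b}{q},
  \qquad
  \lambda = \frac{a - b}{a + (q-1)\,b},
  \qquad
  d\lambda^{2} > 1
  \;\Longleftrightarrow\;
  (a-b)^{2} > q\,\bigl(a + (q-1)\,b\bigr).
\]
```

Under this convention, "above the KS bound" means $d\lambda^2 > 1$ and "below the KS bound" means $d\lambda^2 < 1$.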
Problem

Research questions and friction points this paper is trying to address.

Investigates community recovery in stochastic block models with a growing number of communities, $q = q_n \to \infty$.
Shows that efficient inference of communities remains possible above the Kesten-Stigum bound in this regime.
Establishes low-degree hardness below the Kesten-Stigum bound when $q \ll \sqrt{n}$, and an efficient algorithm below it when $q \gg \sqrt{n}$.
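
To make the model behind these questions concrete, here is a minimal sketch of sampling a symmetric block model. The function name `sample_symmetric_sbm` and the parametrization (uniform labels, edge probability `a/n` within a community and `b/n` between communities) are illustrative assumptions, not taken from the paper.

```python
import random


def sample_symmetric_sbm(n, q, a, b, seed=0):
    """Sample labels and an edge list from a symmetric SBM (illustrative sketch).

    Assumed parametrization: each vertex gets an independent uniform label in
    {0, ..., q-1}; each pair {u, v} is joined independently with probability
    a/n if the labels agree and b/n otherwise.
    """
    rng = random.Random(seed)
    labels = [rng.randrange(q) for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = (a if labels[u] == labels[v] else b) / n
            if rng.random() < p:
                edges.append((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_symmetric_sbm(n=200, q=10, a=40.0, b=2.0, seed=1)
    print(len(labels), "vertices,", len(edges), "edges")
```

The recovery problem is to estimate `labels` (up to a relabeling of the communities) from `edges` alone. The quadratic loop over pairs is fine for small `n` but is not meant for large graphs.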
Innovation

Methods, ideas, or system contributions that make the work stand out.

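Efficient community recovery above the KS bound as $q \to \infty$
Non-backtracking walk algorithm that recovers communities below the KS bound when $q \gg \sqrt{n}$
New candidate threshold for efficient recovery in the $q \gg \sqrt{n}$ regime, posed as an open question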
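
As an illustration of the basic object behind the second item, the sketch below counts non-backtracking walks in a graph: walks $v_0, v_1, \dots, v_k$ along edges in which no step immediately reverses the previous one ($v_{i+1} \neq v_{i-1}$). This is not the paper's algorithm, whose details are not given here, and the function name `count_nb_walks` is hypothetical.

```python
from collections import defaultdict


def count_nb_walks(edges, length):
    """Count non-backtracking walks with `length` >= 1 edges (illustrative sketch).

    Returns a dict mapping (start, end) to the number of walks from start to
    end that never traverse an edge and then immediately return along it.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    # State (start, prev, cur): walks from `start` whose last step is prev -> cur.
    walks = defaultdict(int)
    for u in list(adj):
        for v in adj[u]:
            walks[(u, u, v)] += 1

    for _ in range(length - 1):
        extended = defaultdict(int)
        for (start, prev, cur), count in walks.items():
            for nxt in adj[cur]:
                if nxt != prev:  # forbid stepping straight back
                    extended[(start, cur, nxt)] += count
        walks = extended

    totals = defaultdict(int)
    for (start, _prev, cur), count in walks.items():
        totals[(start, cur)] += count
    return dict(totals)


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(count_nb_walks(triangle, 3))
```

On a triangle, the only non-backtracking walks of length 3 go once around the cycle in either direction, so each vertex has exactly two such walks returning to itself. On a path, a walk cannot turn around, so the counts vanish once `length` exceeds the path length.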