AI Summary
This paper studies community recovery in the stochastic block model (SBM) when the number of communities $q$ diverges with the number of vertices $n$ (i.e., $q \to \infty$), departing from the classical fixed-$q$ regime. Methodologically, it uses the low-degree polynomial framework to establish computational hardness and algorithms based on non-backtracking walks to establish achievability. Key contributions include: (i) showing that efficient recovery remains possible above the Kesten–Stigum (KS) bound, and that recovery is low-degree hard below it when $q \ll \sqrt{n}$; (ii) showing that when $q \gg \sqrt{n}$, an efficient algorithm based on non-backtracking walks recovers the communities even below the KS bound; and (iii) showing that detection is easy and identifying, up to a constant, the information-theoretic threshold for community recovery as $q$ diverges. The work also identifies a new threshold and asks whether it is the threshold for efficient recovery in the regime $q \gg \sqrt{n}$. The low-degree hardness results have consequences for graphon estimation.
Abstract
We study the inference of communities in stochastic block models with a growing number of communities. For block models with $n$ vertices and a fixed number of communities $q$, it was predicted in Decelle et al. (2011) that there are computationally efficient algorithms for recovering the communities above the Kesten--Stigum (KS) bound and that efficient recovery is impossible below the KS bound. This conjecture has since stimulated a lot of interest, with the achievability side proven in a line of research that culminated in the work of Abbe and Sandon (2018). Conversely, recent work by Sohn and Wein (2025) provides evidence for the hardness part using the low-degree paradigm. In this paper we investigate community recovery in the regime $q=q_n \to \infty$ as $n \to \infty$ where no such predictions exist. We show that efficient inference of communities remains possible above the KS bound. Furthermore, we show that recovery of block models is low-degree hard below the KS bound when the number of communities satisfies $q \ll \sqrt{n}$. Perhaps surprisingly, we find that when $q \gg \sqrt{n}$, there is an efficient algorithm based on non-backtracking walks for recovery even below the KS bound. We identify a new threshold and ask if it is the threshold for efficient recovery in this regime. Finally, we show that detection is easy and identify (up to a constant) the information-theoretic threshold for community recovery as the number of communities $q$ diverges. Our low-degree hardness results also naturally have consequences for graphon estimation, improving results of Luo and Gao (2024).
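The abstract refers to the KS bound without stating it. As a rough illustration only, the sketch below evaluates the bound in the commonly used symmetric parameterization, which is an assumption here and may differ from the paper's setup: $q$ equal-sized communities, within-community edge probability $a/n$, and between-community edge probability $b/n$. Under that assumption the bound reads $(a-b)^2 > q\,(a + (q-1)b)$. The function names `ks_snr` and `above_ks` are hypothetical and do not come from the paper.

```python
def ks_snr(a: float, b: float, q: int) -> float:
    """Signal-to-noise ratio (a - b)^2 / (q * (a + (q - 1) * b)).

    Assumed model (not taken from the paper): symmetric SBM with q
    equal-sized communities, edge probability a/n inside a community
    and b/n between communities.
    """
    if q < 2:
        raise ValueError("need at least two communities")
    if a < 0 or b < 0 or a + b == 0:
        raise ValueError("a and b must be non-negative and not both zero")
    return (a - b) ** 2 / (q * (a + (q - 1) * b))


def above_ks(a: float, b: float, q: int) -> bool:
    """True when the parameters lie strictly above the KS bound (ratio > 1)."""
    return ks_snr(a, b, q) > 1.0
```

For instance, `above_ks(5, 1, 2)` is true because $(5-1)^2 = 16$ exceeds $2 \cdot (5 + 1) = 12$, while `above_ks(3, 1, 2)` is false because $4 < 8$. With $a$ and $b$ held fixed, the right-hand side grows with $q$, which is one way to see why the regime $q \to \infty$ needs separate treatment.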