🤖 AI Summary
This work investigates the polynomial-time solvability of community detection in the stochastic block model (SBM) when the number of communities satisfies $K \geq \sqrt{n}$, focusing on the computational phase transition in the moderately sparse regime. Because standard spectral methods fail in this many-communities setting, the authors propose an algorithm based on structured motif counting that combines non-backtracking path analysis with probabilistic combinatorics. They characterize the sharp computational threshold for non-trivial community recovery: motif counting achieves recovery in polynomial time above the threshold recently proposed by Chin et al., while a lower bound for low-degree polynomials shows that this threshold is tight. The result completes the characterization of polynomial-time solvability across density regimes for $K \geq \sqrt{n}$, closing a theoretical gap for SBMs with many communities and indicating that optimal algorithms differ fundamentally from spectral approaches.
📝 Abstract
A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ (where $n$ denotes the number of nodes), non-trivial community recovery is possible in polynomial time above, and only above, the Kesten--Stigum (KS) threshold, originally postulated using arguments from statistical physics.
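For orientation, the KS condition can be written out in the standard symmetric sparse parametrization. The notation below is an assumption, since the abstract does not fix one: $K$ equal-size communities, within-community edge probability $a/n$, and between-community edge probability $b/n$. Under these assumptions, the KS threshold is commonly stated as

$$
\frac{(a-b)^2}{K\,\bigl(a + (K-1)\,b\bigr)} > 1 .
$$

Equivalently, with average degree $d = \bigl(a + (K-1)b\bigr)/K$ and signal eigenvalue $\mu = (a-b)/K$, the condition reads $\mu^2 > d$.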
When $K \geq \sqrt{n}$, Chin, Mossel, Sohn, and Wein recently proved that, in the \emph{sparse regime}, community recovery in polynomial time is achievable below the KS threshold by counting non-backtracking paths. This finding led them to postulate a new threshold for the many-communities regime $K \geq \sqrt{n}$. Subsequently, Carpentier, Giraud, and Verzelen established the failure of low-degree polynomials below this new threshold across all density regimes, and demonstrated successful recovery above the threshold in certain moderately sparse settings. While these results provide strong evidence that, in the many-communities setting, the computational barrier lies at the threshold proposed in Chin et al., the question of achieving recovery above this threshold remains open in most density regimes.
The present work is a follow-up to Carpentier et al., in which we prove Conjecture 1.4 stated therein by:
1. Constructing a family of motifs satisfying specific structural properties; and
2. Proving that community recovery is possible above the proposed threshold by counting such motifs.

Our results complete the picture of the computational barrier for community recovery in the SBM with $K \geq \sqrt{n}$ communities. They also indicate that, in moderately sparse regimes, the optimal algorithms appear to be fundamentally different from spectral methods.
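To make the notion of counting non-backtracking paths more concrete, here is a minimal sketch that counts non-backtracking walks between every pair of nodes using the classical three-term recurrence on the adjacency matrix. It is an illustration only, not the estimator of Chin et al. nor the motif-counting procedure of this paper. The function name `non_backtracking_walk_counts` is hypothetical, and the sketch assumes a simple undirected graph given as a dense 0/1 adjacency matrix.

```python
import numpy as np


def non_backtracking_walk_counts(adj, length):
    """Count non-backtracking walks with `length` edges between all node pairs.

    Illustrative helper (hypothetical name). Assumes `adj` is the symmetric
    0/1 adjacency matrix of a simple undirected graph without self-loops.
    Entry [i, j] of the result is the number of walks from i to j that never
    traverse an edge and then immediately traverse it back.
    """
    adj = np.asarray(adj, dtype=np.int64)
    if length < 1:
        raise ValueError("length must be at least 1")

    n = adj.shape[0]
    degree = np.diag(adj.sum(axis=1))
    identity = np.eye(n, dtype=np.int64)

    # Walks of length 1 are exactly the edges.
    prev = adj.copy()
    if length == 1:
        return prev

    # Walks of length 2: remove the i -> j -> i backtracks from A^2.
    curr = adj @ adj - degree

    # Recurrence for longer walks: N_{k+1} = A N_k - (D - I) N_{k-1}.
    for _ in range(length - 2):
        prev, curr = curr, adj @ curr - (degree - identity) @ prev
    return curr


if __name__ == "__main__":
    # Triangle graph: the only non-backtracking walks of length 3 are the
    # two orientations of the cycle, starting and ending at the same node.
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(non_backtracking_walk_counts(triangle, 3))
```

In an SBM setting, counts like these would be one ingredient of a pairwise statistic for deciding whether two nodes share a community. The centering, normalization, and choice of walk length used in the cited works are not reproduced here.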