🤖 AI Summary
This paper studies community detection under the symmetric $k$-stochastic block model with adversarial node corruption. Targeting the low signal-to-noise regime near the Kesten–Stigum (KS) threshold, we propose the first polynomial-time algorithm that achieves the minimax-optimal misclassification rate $\exp\big(-(1\pm o(1))C/k\big)$ under the mild signal-to-noise condition $C \geq K k^2 \log k$. Our method combines sum-of-squares programming with robust majority voting and introduces a novel graph bisection subroutine, which lets the algorithm tolerate adversarial corruption of up to an $\exp\big(-(1\pm o(1))C/k\big)$ fraction of the nodes. In contrast to prior work, our algorithm avoids both the strong-signal assumption and the high computational cost of earlier methods, and is the first near the KS threshold to combine statistical optimality, polynomial running time, and robustness to adversarial node corruption.
📝 Abstract
We study community detection in the \emph{symmetric $k$-stochastic block model}, where $n$ nodes are evenly partitioned into $k$ clusters with intra- and inter-cluster connection probabilities $p$ and $q$, respectively.
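As an illustration of the model just described, the following is a minimal sketch of a sampler, using only the Python standard library. The function name `sample_symmetric_sbm`, the `seed` parameter, and the edge-set representation are assumptions made for this example and do not come from the paper.

```python
import random


def sample_symmetric_sbm(n, k, p, q, seed=0):
    """Sample (labels, edges) from a symmetric k-stochastic block model.

    Illustrative sketch only: n nodes are split evenly into k clusters;
    each pair of nodes is joined independently with probability p if both
    lie in the same cluster and with probability q otherwise.
    """
    if n % k != 0:
        raise ValueError("n must be divisible by k for an even partition")
    rng = random.Random(seed)

    # Even partition: each label 0..k-1 appears exactly n // k times.
    labels = [i % k for i in range(n)]
    rng.shuffle(labels)

    # Undirected simple graph stored as a set of pairs (u, v) with u < v.
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.add((u, v))
    return labels, edges
```

The quadratic loop over node pairs is chosen for clarity rather than speed; it is adequate for small experiments only.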
Our main result is a polynomial-time algorithm that achieves the minimax-optimal misclassification rate
\begin{equation*}
\exp\Bigl(-\bigl(1 \pm o(1)\bigr) \frac{C}{k}\Bigr),
\quad \text{where } C = (\sqrt{pn} - \sqrt{qn})^2,
\end{equation*}
whenever $C \ge K\,k^2\,\log k$ for some universal constant $K$, matching the Kesten--Stigum (KS) threshold up to a $\log k$ factor.
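To make the quantities above concrete, here is a small sketch that evaluates $C = (\sqrt{pn} - \sqrt{qn})^2$, the leading-order rate $\exp(-C/k)$ with the $(1 \pm o(1))$ factor dropped, and the condition $C \ge K k^2 \log k$. The abstract does not give the value of the universal constant $K$, so the default `K=1.0` is a placeholder assumption, and the function names are hypothetical.

```python
import math


def snr_C(n, p, q):
    """C = (sqrt(p*n) - sqrt(q*n))**2, as defined in the abstract."""
    return (math.sqrt(p * n) - math.sqrt(q * n)) ** 2


def leading_order_rate(C, k):
    """exp(-C/k): the minimax rate with the (1 +/- o(1)) factor ignored."""
    return math.exp(-C / k)


def meets_condition(C, k, K=1.0):
    """Check C >= K * k^2 * log k.

    K is an unspecified universal constant in the abstract; the default
    of 1.0 is a placeholder for illustration, not the paper's value.
    """
    return C >= K * k**2 * math.log(k)
```

For example, with $n = 100$, $p = 0.09$, and $q = 0.01$, we get $\sqrt{pn} = 3$ and $\sqrt{qn} = 1$, so $C = 4$.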
Notably, this rate holds even when an adversary corrupts an $\eta \le \exp\bigl(- (1 \pm o(1)) \frac{C}{k}\bigr)$ fraction of the nodes.
To the best of our knowledge, the minimax rate was previously only attainable either via computationally inefficient procedures [ZZ15] or via polynomial-time algorithms that require strictly stronger assumptions such as $C \ge K k^3$ [GMZZ17].
In the node-robust setting, the best known algorithm requires the substantially stronger condition $C \ge K k^{102}$ [LM22].
Our results close this gap by providing the first polynomial-time algorithm that achieves the minimax rate near the KS threshold in both settings.
Our work has two key technical contributions:
(1) we robustify majority voting via the Sum-of-Squares framework,
(2) we develop a novel graph bisection algorithm via robust majority voting, which allows us to significantly improve the misclassification rate to $1/\mathrm{poly}(k)$ for the initial estimation near the KS threshold.
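For orientation, the sketch below shows one round of classical, non-robust majority voting, the baseline step that contribution (1) robustifies. It is not the paper's Sum-of-Squares procedure and offers no protection against corrupted nodes. The function name, the adjacency-list input, and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def majority_vote_refine(adj, labels, k):
    """One synchronous round of plain (non-robust) majority voting.

    adj[u] is an iterable of the neighbors of node u, and labels[u] in
    range(k) is an initial cluster estimate. Each node moves to the
    cluster holding most of its neighbors under the *old* labels. With
    equal-sized clusters, comparing raw counts matches comparing edge
    densities. Ties keep the current label, then prefer the smaller index.
    """
    new_labels = []
    for u in range(len(labels)):
        counts = Counter(labels[v] for v in adj[u])
        best = max(
            range(k),
            key=lambda c: (counts[c], c == labels[u], -c),
        )
        new_labels.append(best)
    return new_labels
```

As a usage example, take two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3), with node 2 initially mislabelled: `adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]` and `labels = [0, 0, 1, 1, 1, 1]`. One round moves node 2 back to cluster 0, since two of its three neighbors carry label 0.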